TITLE OF THE INVENTION 
IMAGE RECOGNITION METHOD AND APPARATUS UTILIZING EDGE 
DETECTION BASED ON MAGNITUDES OF COLOR VECTORS 
EXPRESSING COLOR ATTRIBUTES OF RESPECTIVE PIXELS OF 
COLOR IMAGE

BACKGROUND OF THE INVENTION 
Field of Application 

The present invention relates to an image
recognition method and an image recognition apparatus
for use in an image recognition system, for extracting
from a color image the shapes of objects which are to
be recognized. In particular, the invention relates to
an image recognition apparatus which provides a
substantial improvement in edge detection performance
when applied to images such as aerial photographs or
satellite images which exhibit a relatively low degree
of variation in intensity values.
Description of Prior Art 

In the prior art, various types of image
recognition apparatus are known, which are intended for
various different fields of application. Typically,
the image recognition apparatus may be required to
extract from an image, such as a photograph, all
objects having a shape which falls within some
predetermined category.

One approach to the problem of increasing the
accuracy of image recognition of the contents of
photographs is to set the camera which takes the
photographs in a fixed position and to fix the lighting
conditions etc., so that the photographic conditions
are always identical. Another approach is to attach
markers, etc., to the objects which are to be
recognized.

However in the case of recognizing shapes within
satellite images or aerial photographs, such prior art
methods of improving accuracy cannot be applied. That
is to say, the photographic conditions such as the
camera position, camera orientation, weather
conditions, etc., will vary each time that a photograph
is taken. Furthermore, a single image may contain many
categories of image data, such as image data
corresponding to buildings, rivers, streets, etc., so
that the image contents are complex. As a result, the
application of image recognition to satellite images or
aerial photographs is extremely difficult.

To extract the shapes of objects which are to be
recognized, from the contents of an image, image
processing to detect edges etc., can be implemented by
using the differences between color values (typically,
the intensity, i.e., gray-scale values) of the pixels
which constitute a region representing an object which
is to be recognized and the color values of the pixels
which constitute adjacent regions to these objects.
Edge detection processing consists of detecting
positions at which there are abrupt changes in the
pixel values, and recognizing such positions as
corresponding to the outlines of physical objects.
Various types of edge detection processing are known.
With a typical method, smoothing processing is applied
overall to the pixel values, then each of the pixels
for which the first derivative of the intensity
variation gradient within the image reaches a local
maximum and exceeds a predetermined threshold value is
determined, with each such pixel being assumed to be
located on an edge of an object in the image.
Alternatively, a "zero-crossing" method can be applied,
e.g., whereby the zero crossings of the second
derivative of the gradient are detected to obtain
the locations of the edge pixels. With a template
technique, predetermined shape templates are compared
with the image contents to find the approximate
positions of objects that are to be recognized, then
edge detection processing may be applied to the results
obtained.

Although prior art image recognition techniques
are generally based upon intensity values of the pixels
of an image, various methods are possible for
expressing the pixel values of color image data. If
the HSI (hue, saturation, intensity) color space is
used, then any pixel can be specified in terms of the
magnitude of its hue, saturation or intensity
component. The RGB (red, green, blue) method is
widely used for expressing image data, however
transform processing can be applied to convert such
data to HSI form, and edge detection processing can
then be applied by operating on the intensity values
which are thereby obtained. HSI information has the
advantage of being readily comprehended by a human
operator. In particular, an image can easily be judged
by a human operator as having a relatively high or
relatively low degree of variation in intensity (i.e.,
high contrast or low contrast).

Due to the difficulties which are experienced in
the practical application of image recognition
processing to satellite images or aerial photographs,
it would be desirable to effectively utilize all of the
color information that is available within such a
photograph, that is to say, to use not only the
intensity values of the image but also the hue and
saturation information contained in the image. However
in general with prior art types of edge detection
processing, only parts of the color information, such
as the intensity values alone, are utilized.

A method of edge detection processing is described
in Japanese patent HEI 6-83962, which uses a
zero-crossing method and, employing a HSI color space
(referred to therein using the designations
L*, C*ab, H*ab for the intensity, saturation and hue
values respectively) attempts to utilize not only the
intensity values but also hue and saturation
information. In Fig. 47, diagrams 200, 201, 202 and
203 show respective examples of the results of image
recognition, applied to a color picture of an
individual, which are obtained by using that method.
Diagram 200 shows the result of edge detection
processing that is applied using only the intensity
values of each of the pixels of the original picture,
diagram 201 shows the result of edge detection
processing that is applied using only the hue values,
and diagram 202 shows the result obtained by using only
the saturation values. Diagram 203 shows the result
that is obtained by combining the results shown in
diagrams 200, 201 and 202. As can be seen, a
substantial amount of noise arises in the image
expressed by the saturation values, and this noise is
inserted into the combined image shown in diagram 203.

In some cases, image smoothing processing is
applied in order to reduce the amount of noise within
an image, before performing edge detection processing,
i.e., the image is pre-processed by using a smoothing
filter to blur the image, and edge detection processing
is applied to the resultant image.

In order to obtain satisfactory results from edge
detection processing which is to be applied to an image
such as a satellite image or aerial photograph, for
example to accurately and reliably extract the shapes
of specific objects such as roads, buildings etc., from
the image contents, it is necessary not only to
determine the degree of "strength" of each edge, but
also the direction along which an edge is oriented. In
the following, and in the description of embodiments of
the invention and in the appended claims, the term
"edge" is used in the sense of a line segment which is
used as a straight-line approximation to a part of a
boundary between two adjacent regions of a color image.
The term "strength" of an edge is used herein to
signify a degree of color difference between pixels
located adjacent to one side of that edge and pixels
located adjacent to the opposite side, while the term
"edge direction" is used in referring to the angle of
orientation of an edge within the image, which is one
of a predetermined limited number of angles. If the
direction of an edge could be accurately determined
(i.e., based upon only a part of the pixels which
constitute that edge), then this would greatly simplify
the process of determining all of the pixels which are
located along that edge. That is to say, if the edge
direction could be reliably estimated by
using only a part of the pixels located on that edge,
then it would be possible to compensate for any
discontinuities within the edge which is obtained as a
result of the edge detection processing, so that an
output image could be generated in which all edges are
accurately shown as continuous lines.

However with the method described in Japanese
patent HEI 6-83962, only the zero-crossing method is
used, so that it is not possible to determine edge
directions, since only each local maximum of variation
of a gradient of a color attribute is detected,
irrespective of the direction along which that
variation is oriented. With other types of edge
detection processing such as the object template
method, processing of intensity values, hue values and
saturation values can be performed respectively
separately, to obtain respective edge directions.
However even if the results thus obtained are combined,
accurate edge directions cannot be detected.
Specifically, the edge directions which result from
using intensity values, hue values and saturation
values may be entirely different from one another, so
that accurate edge detection cannot be achieved by
taking the average of these results.

Moreover, in the case of a color image such as a
satellite image or aerial photograph which presents
special difficulties with respect to image recognition,
it would be desirable to be able to flexibly adjust the
image recognition processing in accordance with the
overall color characteristics of the image that is to
be processed. That is to say, it should be possible
for example for a human operator to examine such an
image prior to executing image recognition processing,
to estimate whether different objects in the image
differ mainly with respect to differences in
hue, or whether the objects are mainly distinguished by
differences in gray-scale level, i.e., intensity
values. The operator should then be able to adjust the
image recognition apparatus to operate in a manner that
is best suited to these image characteristics, i.e., to
extract the edges of objects based on the entire color
information of the image, but for example placing
emphasis upon the intensity values of pixels, or upon
the chrominance values of the pixels, whichever is
appropriate. However such a type of image recognition
apparatus has not been available in the prior art.

Furthermore, in order to apply image recognition
processing to an image whose color data are expressed
with respect to an RGB color space, it is common
practice to first convert the color image data to an
HSI (hue, saturation, intensity) color space, i.e.,
expressing the data of each pixel as a position within
such a color space. This enables a human operator to
more readily judge the color attributes of the overall
image prior to executing the image recognition
processing, and enables such processing to be applied
to only a specific color attribute of each of the
pixels, such as the intensity or the saturation
attribute. However if processing is applied to RGB
data which contain some degree of scattering of the
color values, and a transform from RGB to HSI color
space is executed, then the resultant values of
saturation will be unstable (i.e., will tend to vary
randomly with respect to the correct values) within
those regions of the image in which the intensity
values are high, and also within those regions of the
image in which the intensity values are low. For
example, assuming that each of the red, green and blue
values of each pixel is expressed by 8 bits, so that
the range of values is from 0 to 255, then in the case
of a region of the image in which the intensity values
are low, if any of the red, green or blue values of a
pixel within that region should increase by 1, this
will result in a large change in the corresponding
value of saturation that is obtained by the transform
processing operation. Instability of the saturation
values will be expressed as noise, i.e., spurious edge
portions, in the results of edge detection processing
which utilizes these values. For that reason it has
been difficult in the prior art to utilize the color
saturation information contained in a color image, in
image recognition processing.
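
The sensitivity of saturation to a unit change in one color
value, in a region of low intensity, can be illustrated
numerically. The following is only a minimal sketch: it assumes
one common RGB-to-HSI convention in which saturation is
1 - 3*min(r, g, b)/(r + g + b), a formula which is not stated in
this specification, and the function names are hypothetical.

```python
def hsi_saturation(r, g, b):
    """Saturation under the assumed convention 1 - 3*min/(r+g+b)."""
    total = r + g + b
    if total == 0:
        return 0.0  # black: saturation is undefined, treated here as 0
    return 1.0 - 3.0 * min(r, g, b) / total


def saturation_jump(level):
    """Change in saturation when the red value of a gray pixel rises by 1."""
    before = hsi_saturation(level, level, level)
    after = hsi_saturation(level + 1, level, level)
    return after - before


dark_jump = saturation_jump(1)    # pixel (1, 1, 1) becomes (2, 1, 1)
mid_jump = saturation_jump(128)   # pixel (128, 128, 128) becomes (129, 128, 128)
```

Under this assumed formula, the unit change raises the
saturation of the dark pixel from 0 to 0.25, whereas the same
change at gray level 128 alters the saturation by less than
0.003.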

Furthermore if a substantial degree of smoothing
processing is applied to an image which is to be
subjected to image recognition, in order to suppress
the occurrence of such noise, then this has the effect
of blurring the image, causing rounding of the shapes
of edges and also merging together any edges which are
located close to one another. As a result, the
accuracy of extracting edge information will be
reduced. Conversely, if only a moderate degree of
smoothing processing is applied to the image that is to
be subjected to image recognition, or if smoothing
processing is not applied to the image, then the
accuracy of extraction of shapes from the image will be
high, but there will be a high level of noise in the
results so that reliable extraction of the shapes of
the required objects will be difficult to achieve.

Moreover in the prior art, there has been no
simple and effective method of performing image
recognition processing to extract the shapes of objects
which are to be recognized, which will eliminate
various small objects in the image that are not
intended to be recognized (and therefore can be
considered to constitute noise) without distorting the
shapes of the objects which are to be recognized.

SUMMARY OF THE INVENTION 
It is an objective of the present invention to
overcome the disadvantages of the prior art set out
above, by providing an image recognition method and
image recognition apparatus whereby edge detection for
extracting the outlines of objects appearing in a color
image can be performed by utilizing all of the color
information of the pixels of the color image, to
thereby achieve a substantially higher degree of
reliability of detecting those pixels which constitute
edges of objects that are to be recognized than has
been possible in the prior art, and furthermore to
provide an image recognition method and apparatus
whereby, when such an edge pixel is detected, the
direction of the corresponding edge can also be
detected.

It is a further objective of the invention to
provide an image recognition method and image
recognition apparatus whereby processing to extract the
shapes of objects which are to be recognized can be
performed such as to eliminate the respective shapes of
small objects that are not intended to be recognized,
without distorting the shapes of the objects which are
to be recognized.

To achieve the above objectives, the invention
provides an image recognition method and apparatus
whereby, as opposed to prior art methods which are
based only upon intensity values, i.e., the gray-scale
values of the pixels of a color image that is to be
subjected to image recognition processing,
substantially all of the color information (intensity,
hue and saturation information) contained in the color
image can be utilized for detecting the edges of
objects which are to be recognized. This is basically
achieved by successively selecting each pixel to be
processed, i.e., as the object pixel, and determining,
for each of a plurality of possible edge directions, a
vector referred to as an edge vector whose modulus
indicates an amount of color difference between two
sets of pixels which are located on opposing sides of
the object pixel with respect to that edge direction.
The moduli of the resultant set of edge vectors are
then compared, and the edge vector having the largest
modulus is then assumed to correspond to the most
likely edge on which the object pixel may be located.
That largest value of edge vector modulus is referred
to as the "edge strength" of the object pixel, and the
direction corresponding to that edge vector is assumed
to be the most likely direction of an edge on which the
object pixel may be located, i.e., a presumptive edge
for that pixel. Subsequently, it is judged that the
object pixel is actually located on its presumptive
edge if it satisfies the conditions that:

(a) its edge strength exceeds a predetermined
minimum threshold value, and

(b) its edge strength is greater than the
respective edge strength values of the two pixels which
are located immediately adjacent to it, on opposing
sides with respect to the direction of that presumptive
edge.

The above processing can be achieved in a simple
manner by predetermining only a limited number of
possible edge directions which can be recognized, e.g.,
0 degrees (horizontal), 90 degrees (vertical), 45
degrees diagonal and -45 degrees diagonal. With the
preferred embodiments of the invention, a set of arrays
of numeric values referred to as edge templates is
utilized, with each edge template corresponding to a
specific one of the predetermined edge directions, and
with the values thereof predetermined such that when
the color vectors of an array of pixels centered on the
object pixel are subjected to array multiplication by
an edge template, the edge vector corresponding to the
direction of that edge template will be obtained as the
vector sum of the result. The respective moduli of the
edge vectors thereby derived for each of the possible
edge directions are then compared, to find the largest
of these moduli, as the edge strength of the object
pixel.
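
The derivation of edge vectors by means of edge templates, and
the judgement based on conditions (a) and (b), can be sketched
as follows. This is an illustrative sketch only, not the
processing of the embodiments: the 3x3 template values are
hypothetical (Prewitt-like values are assumed here, whereas the
templates actually used are those shown in the drawings), the
direction names and function names are invented for the
example, rows are assumed to increase downwards, and pixels on
the image border are simply skipped.

```python
import math

# Hypothetical 3x3 templates (Prewitt-like values chosen for illustration).
EDGE_TEMPLATES = {
    "horizontal": [[1, 1, 1], [0, 0, 0], [-1, -1, -1]],
    "vertical": [[1, 0, -1], [1, 0, -1], [1, 0, -1]],
    "diag_45": [[1, 1, 0], [1, 0, -1], [0, -1, -1]],
    "diag_minus_45": [[0, 1, 1], [-1, 0, 1], [-1, -1, 0]],
}

# (row, column) offsets of the two pixels lying on opposite sides of an
# edge oriented in each direction (assumes rows increase downwards).
ACROSS_EDGE = {
    "horizontal": ((-1, 0), (1, 0)),
    "vertical": ((0, -1), (0, 1)),
    "diag_45": ((-1, -1), (1, 1)),
    "diag_minus_45": ((-1, 1), (1, -1)),
}


def edge_strength_and_direction(image, row, col):
    """Return (edge strength, presumptive direction) of the object pixel."""
    depth = len(image[0][0])
    best = (0.0, None)
    for direction, template in EDGE_TEMPLATES.items():
        # Multiply each color vector by its template value and sum the vectors.
        edge_vector = [0.0] * depth
        for i in range(3):
            for j in range(3):
                pixel = image[row + i - 1][col + j - 1]
                for k in range(depth):
                    edge_vector[k] += template[i][j] * pixel[k]
        modulus = math.sqrt(sum(v * v for v in edge_vector))
        if modulus > best[0]:
            best = (modulus, direction)
    return best


def detect_edges(image, threshold):
    """Return a 2D list of booleans marking pixels judged to lie on an edge."""
    rows, cols = len(image), len(image[0])
    strength = [[0.0] * cols for _ in range(rows)]
    direction = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            strength[r][c], direction[r][c] = edge_strength_and_direction(image, r, c)
    edges = [[False] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            s = strength[r][c]
            if s <= threshold or direction[r][c] is None:  # condition (a)
                continue
            (r1, c1), (r2, c2) = ACROSS_EDGE[direction[r][c]]
            # Condition (b): stronger than both neighbours across the edge.
            if s > strength[r + r1][c + c1] and s > strength[r + r2][c + c2]:
                edges[r][c] = True
    return edges
```

In this sketch condition (b) is applied as a strict inequality,
so that two adjacent pixels having exactly equal edge strength
would both be rejected; how such ties are resolved is a design
choice which the sketch does not attempt to settle.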

In that way, since all of the color information
contained in the image can be utilized to perform edge
detection, the detection can be more accurately and
reliably performed than has been possible in the prior
art.

According to another aspect of the invention, data
expressing the color attributes of pixels of a color
image which is to be subjected to edge detection
processing are first subjected to transform processing
to express the color attributes of the pixels of the
image as respective sets of coordinates of an
appropriate color space, in particular, a color space
in which intensity and chrominance information are
expressed by separate coordinates. This enables the
color attribute information to be modified prior to
performing edge detection, such as to optimize the
results that will be obtained in accordance with the
characteristics of the particular color image that is
being processed. That is to say, the relative amount
of contribution of the intensity values to the
magnitudes of the aforementioned color vectors can be
increased, for example. If the color attributes are
first transformed into an HSI (hue, saturation,
intensity) color space, then since such HSI values are
generally expressed in polar coordinates, a simple
conversion operation is applied to each set of h, s, i
values of each pixel to express the color attributes as
a color vector of an orthogonal color space in which
saturation information and chrominance information are
expressed along respectively different coordinate axes,
i.e. to express the pixel color attributes as a
plurality of linear coordinates of that color space,
and the edge detection processing is then executed.
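
The conversion from polar HSI values to linear coordinates can
be sketched as follows. This is a minimal sketch under stated
assumptions: hue is taken as an angle in degrees, hue and
saturation are mapped onto two axes of the orthogonal space and
intensity onto the third, and the function name and the
intensity_weight parameter (a hypothetical means of increasing
the relative contribution of the intensity values) do not
appear in this specification.

```python
import math


def hsi_to_color_vector(hue_degrees, saturation, intensity, intensity_weight=1.0):
    """Express polar (h, s, i) values as a vector of linear coordinates."""
    hue = math.radians(hue_degrees)
    x = saturation * math.cos(hue)      # chrominance axis 1
    y = saturation * math.sin(hue)      # chrominance axis 2
    z = intensity_weight * intensity    # intensity axis, optionally emphasized
    return (x, y, z)
```

With intensity_weight greater than 1, differences in intensity
between pixels contribute more to the modulus of a difference
of two such vectors than do differences in hue or saturation.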

It is known that when image data are transformed
from a form such as RGB color values into an HSI color
space, instability (i.e., random large-scale
variations) may occur in the saturation values which
are obtained as a result of the transform. This
instability of saturation values is most prevalent in
those regions of a color image where the intensity
values are exceptionally low, and also in those regions
where the intensity values are exceptionally high.
This is a characteristic feature of such a transform
operation, and causes noise to appear in the results of
edge detection that is applied to such HSI-transformed
image data and utilizes the saturation information, due
to the detection of spurious edge portions as a result
of abrupt changes in saturation values between adjacent
pixels. However with the present invention, such
instability of the saturation values can be reduced, by
modifying the saturation values obtained for respective
pixels in accordance with the magnitudes of the
intensity values which are derived for these pixels.
The noise which would otherwise be generated by such
instability of saturation values can thereby be
suppressed, enabling more reliable recognition of
objects in the color image to be achieved.

According to one aspect of the invention, when a
transform into coordinates of the HSI space has been
executed, such reduction of instability of the
saturation values is then achieved by decreasing the
saturation values in direct proportion to amounts of
decrease in the intensity values. Alternatively, that
effect is achieved by decreasing the saturation values
in direct proportion to decreases in the intensity
values from a median value of intensity towards a
minimum value (i.e., black) and also decreasing the
saturation values in direct proportion to increases in
the intensity values from that median value towards a
maximum value (i.e., white).
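
The two alternatives described above can be sketched as simple
scaling functions. The sketch assumes an intensity range of 0
to 255, and assumes that "direct proportion" means a linear
scale factor which reaches zero at black (and, in the second
alternative, also at white); the function names and the exact
scale factors are illustrative assumptions only.

```python
def scale_saturation_linear(saturation, intensity, max_intensity=255.0):
    """Reduce saturation in proportion to the decrease of intensity from maximum."""
    return saturation * (intensity / max_intensity)


def scale_saturation_about_median(saturation, intensity, max_intensity=255.0):
    """Reduce saturation in proportion to the distance of intensity from the median."""
    median = max_intensity / 2.0
    return saturation * (1.0 - abs(intensity - median) / median)
```

With the second function a pixel at the median intensity keeps
its saturation unchanged, while a pixel at black or at white
has its saturation reduced to zero.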

According to another aspect of the invention, when
a transform into coordinates of the HSI space has been
executed, such reduction of instability of the
saturation values is then achieved by utilizing a
predetermined saturation value modification function
(which varies in a predetermined manner in accordance
with values of intensity) to modify the saturation
values. In the case of a transform from the RGB color
space to the HSI color space, that saturation value
modification function is preferably derived based on
calculating, for each of the sets of r, g, b values
expressing respective points in the RGB color space,
the amount of actual change which occurs in the
saturation value s of the corresponding HSI set of
transformed h, s, i values in response to a small-scale
change in one of that set of r, g, b values. In that
way, a saturation value modification function can be
derived which is based on the actual relationship
between transformed intensity values and instability of
the corresponding saturation values, and can thus be
used such as to maintain the saturation values
throughout a color image at a substantially constant
level, i.e., by varying the saturation values in
accordance with the intensity values such as to
appropriately compensate in those regions of the color
space in which instability of the saturation values can
occur.
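
One conceivable way of deriving such a modification function
from measured changes in saturation is sketched here. The
sketch is heavily simplified and rests on assumptions which are
not taken from this specification: saturation is computed as
1 - 3*min(r, g, b)/(r + g + b), only gray pixels (r = g = b)
are sampled, the small-scale change is an increase of 1 in the
red value, and the function is chosen so that the modified
response to that change is the same at every gray level. All
names are hypothetical.

```python
def hsi_saturation(r, g, b):
    """Saturation under the assumed convention 1 - 3*min/(r+g+b)."""
    total = r + g + b
    return 0.0 if total == 0 else 1.0 - 3.0 * min(r, g, b) / total


def saturation_sensitivity(level):
    """Saturation change when the red value of a gray pixel at `level` rises by 1."""
    base = hsi_saturation(level, level, level)
    return abs(hsi_saturation(level + 1, level, level) - base)


def build_modification_table(max_level=255):
    """One scale factor per intensity level, equalizing the measured sensitivity."""
    reference = saturation_sensitivity(max_level)
    return [reference / saturation_sensitivity(level) for level in range(max_level + 1)]


def modify_saturation(saturation, intensity, table):
    """Apply the tabulated modification function to one saturation value."""
    return saturation * table[int(round(intensity))]
```

A table built in this way attenuates saturation strongly at low
intensity levels and leaves it unchanged at the highest level;
treating the high-intensity region as well would require a
saturation convention, or a sampling of non-gray colors, which
this sketch does not cover.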

Noise in the edge detection results, caused by
detection of spurious edge portions, can be thereby
very effectively suppressed, enabling accurate edge
detection to be achieved.

According to another aspect, the invention
provides an image recognition method and apparatus for
operating on a region image (i.e., an image formed of a
plurality of regions expressing the shapes of various
objects, each region formed of a continuously extending
set of pixels in which each pixel is identified by a
label as being contained in that region) to process the
region image such as to reduce the amount of noise
caused by the presence of various small regions, which
are not required to be recognized. This is achieved by
detecting each small region having an area that is less
than a predetermined threshold value, and combining
each such small region with an immediately adjacent
region, with the combining process being executed in
accordance with specific rules which serve to prevent
distortion of the shapes of objects that are to be
recognized. These rules preferably stipulate that each
of the small regions is to be combined with an
immediately adjacent other region which (out of all of
the regions immediately adjacent to that small region)
has a maximum length of common boundary line with
respect to that small region. In that way, regions
are combined without consideration of the pixel values
(of an original color image) within the regions and
considering only the sizes and shapes of the regions,
whereby it becomes possible to eliminate small regions
which would constitute "image noise", without reducing
the accuracy of extracting the shapes of objects which
are to be recognized.

The aforementioned rules for combining regions may
further stipulate that the combining processing is to
be executed repetitively, to operate successively on
each of the regions which are below the aforementioned
area size threshold value, starting from the smallest
of these regions, then the next-smallest, and so on. It
has been found that this provides even greater
effectiveness in elimination of image noise, without
reducing the accuracy of extracting the shapes of
objects which are to be recognized.
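
The combining rules set out above (smallest region first, each
combined with the adjacent region sharing the longest common
boundary) can be sketched as follows. This is an illustrative
sketch under stated assumptions: the region image is held as a
nested list of integer labels, the length of a common boundary
is counted as the number of 4-connected pixel adjacencies
between the two regions, and the function name is hypothetical.

```python
from collections import Counter

NEIGHBOUR_OFFSETS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def merge_small_regions(labels, min_area):
    """Return a copy of the label image with regions smaller than min_area merged."""
    labels = [list(row) for row in labels]
    rows, cols = len(labels), len(labels[0])
    while True:
        areas = Counter(label for row in labels for label in row)
        small = [label for label, area in areas.items() if area < min_area]
        if not small or len(areas) == 1:
            return labels
        # Operate on the smallest remaining small region first.
        target = min(small, key=lambda label: areas[label])
        # Count boundary adjacencies shared with each neighbouring region.
        shared = Counter()
        for r in range(rows):
            for c in range(cols):
                if labels[r][c] != target:
                    continue
                for dr, dc in NEIGHBOUR_OFFSETS:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc] != target:
                        shared[labels[rr][cc]] += 1
        # Combine with the neighbour having the longest common boundary.
        winner = shared.most_common(1)[0][0]
        labels = [[winner if label == target else label for label in row]
                  for row in labels]
```

The areas are recomputed after every combination, so that a
small region which grows above the threshold by absorbing
another is no longer treated as small. Pixel color values play
no part in the decision.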

Alternatively, the region combining processing may
be executed on the basis that the aforementioned rules
for combining regions further stipulate that, for each
of the small regions which are below the aforementioned
area size threshold value, the total area of the
regions immediately adjacent to that small region is to
be calculated, and the aforementioned combining
processing is then to be executed starting with the
small region for which that adjacent area total is the
largest, then the small region for which the adjacent
area total is the next-largest, and so on in succession
for all of these small regions.

A region image, for applying such region combining
processing, can for example be generated by first
applying edge detection by an edge detection apparatus
according to the present invention to an original color
image, to obtain data expressing an edge image in which
only the edges of objects appear, then defining each
part of that edge image which is enclosed within a
continuously extending edge as a separate region, and
attaching a common identifier label to each of the
pixels constituting that region.
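
The generation of a region image from an edge image can be
sketched as a labelling of connected non-edge pixels. This is a
minimal sketch under assumptions not stated in the
specification: the edge image is a nested list which is truthy
at edge pixels, regions are grown with 4-connectivity, edge
pixels themselves receive the label 0, and the function name is
hypothetical.

```python
from collections import deque


def label_regions(edge_image):
    """Assign a common label (1, 2, ...) to each area enclosed by edges."""
    rows, cols = len(edge_image), len(edge_image[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if edge_image[r][c] or labels[r][c]:
                continue  # edge pixel, or already labelled
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not edge_image[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels
```

Any gap in an edge allows the fill to pass through it, so the
two sides are given the same label; a region image obtained in
this way is therefore only as good as the continuity of the
detected edges.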

More specifically, the present invention provides
an image recognition method for processing image data
of a color image which is represented as respective
sets of color attribute values of an array of pixels,
to successively operate on each of the pixels as an
object pixel such as to determine whether that pixel is
located on an edge within the color image, and thereby
derive shape data expressing an edge image which shows
only the outlines of objects appearing in the color
image, with the method comprising steps of:

if necessary, i.e., if the color attribute
values of the pixels are not originally expressed as
sets of coordinates of an orthogonal color space such
as an RGB (red, green, blue) color space, expressing
these sets of color attribute values as respective
color vectors, with each color vector defined by a
plurality of scalar values which are coordinates of an
orthogonal color space;

for each of a plurality of predetermined edge
directions, generating a corresponding edge template as
an array of respectively predetermined numeric values;

extracting an array of color vectors as respective
color vectors of an array of pixels having the object
pixel as the center pixel of that array;

successively applying each of the edge templates
to the array of color vectors in a predetermined array
processing operation, to derive edge vectors
respectively corresponding to the edge directions;

comparing the respective moduli of the derived
edge vectors to find the maximum modulus value,
designating that maximum value as the edge strength of
the object pixel and designating the edge direction
corresponding to an edge vector having that maximum
modulus as being a possible edge direction for the
object pixel; and,

judging whether the object pixel is located on an
actual edge which is oriented in the possible edge
direction, based upon comparing the edge strength of
the object pixel with respective values of edge
strength derived for pixels which are positioned
immediately adjacent to the object pixel and are on
mutually opposite sides of the object pixel with
respect to the aforementioned possible edge direction.

The invention further provides an image
recognition method for operating on shape data
expressing an original region image (i.e., an image in
which pixels are assigned respective labels indicative
of various image regions in which the pixels are
located) to obtain shape data expressing a region image
in which specific small regions appearing in the
original region image have been eliminated, with the
method comprising repetitive execution of a series of
steps of:

selectively determining respective regions of the
original region image as constituting a set of small
regions which are each to be subjected to a region
combining operation;

selecting one of the set of small regions as a
next small region which is to be subjected to the
region combining operation;

for each of respective regions which are disposed
immediately adjacent to the next small region,
calculating a length of common boundary line with
respect to the next small region, and determining one
of the immediately adjacent regions which has a maximum
value of the length of boundary line; and

combining the next small region with the
adjacent region having the maximum length of common
boundary line.

Data expressing a region image, to be processed by
the method set out above, can be reliably derived by
converting an edge image which has been generated by
the preceding method of the invention into a region
image.

The above features of the invention will be more
clearly understood by referring to the following
description of preferred embodiments of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS 
Fig. 1 is a general system block diagram of a
first embodiment of an image recognition apparatus
according to the present invention;

Fig. 2 is a conceptual diagram showing an example
of actual color attribute values of pixels in a color
image, expressed in terms of an RGB color space;

Fig. 3 illustrates an RGB color space;

Fig. 4 is a diagram illustrating an edge image
obtained as a result of applying edge detection to a
simplified color image;

Fig. 5 is a basic flow diagram of the operation of
the first embodiment;

Figs. 6A to 6D are conceptual diagrams showing
respective edge templates used with the first
embodiment, and corresponding edge directions;

Fig. 7 shows examples of a set of edge vectors; 

Fig. 8 is a diagram illustrating how one of the 
edge vectors of Fig. 7 defines the edge strength and 
possible edge direction for a pixel; 

Figs. 9A to 9D are conceptual diagrams for 
illustrating how the edge strength of an object pixel 
is compared with the respective edge strengths of 
pixels which are located adjacent thereto, on opposing 
sides with respect to an edge direction, for each of 
the possible edge directions; 

Fig. 10 is a diagram for use in describing how an 
edge image is obtained as a result of applying edge 
detection by the apparatus of the first embodiment to a 
simplified color image; 

Fig. 11 is a flow diagram showing details of 
processing to derive edge strength and possible edge 
direction information for each of the pixels of a color 
image in succession, with the first embodiment of the 
invention; 

Fig. 12 is a flow diagram showing details of 
processing, executed using the edge strength and edge 
direction information derived in the flow diagram of
Fig. 11, to determine those pixels of the color image
which are located on actual edges; 

Figs. 13 and 14 are flow diagrams showing alternative
forms of the processing executed in the flow diagrams
of Figs. 11 and 12 respectively;

Fig. 15 is a general system block diagram of a 
second embodiment of an image recognition apparatus 
according to the present invention; 

Fig. 16 is a basic flow diagram of the operation 
of the second embodiment; 

Fig. 17 is a diagram illustrating an orthogonal 
color space utilized with the second embodiment, in 
which the respective proportions of color values of a 
pixel are expressed as coordinate values, rather than 
the color values themselves; 

Fig. 18 is a diagram for use in describing how an 
edge image is obtained as a result of applying edge 
detection by the apparatus of the second embodiment to 
a simplified color image; 

Fig. 19 is a flow diagram showing details of 
processing to derive edge strength and possible edge 
direction information for each of the pixels of a color 
image in succession, with the second embodiment of the 
invention;

Fig. 20 is a diagram illustrating an HSI color
space utilized with a third embodiment of the 
invention; 

Fig. 21 represents a simplified color image in
which specific amounts of variation in color values 
occur within various regions of the image; 

Fig. 22 is a diagram showing an edge image which 
is obtained as a result of applying edge detection by 
the apparatus of the third embodiment to the simplified 
color image of Fig. 21; 

Fig. 23 is a flow diagram showing details of 
processing to derive edge strength and possible edge 
direction information for each of the pixels of a color 
image in succession, with the third embodiment of the
invention; 

Fig. 24 is a diagram illustrating a modified HSI 
color space, of inverted conical form, utilized with a 
fourth embodiment of the invention; 

Fig. 25 is a diagram showing an edge image which 
is obtained as a result of applying edge detection by 
the apparatus of the fourth embodiment to the 
simplified color image of Fig. 21; 

Fig. 26 is a table of examples of sets of hue,
saturation and intensity values which are derived by
transforming the color values of respective regions of
the color image represented in Fig. 21 into
corresponding values of a cylindrical (i.e.,
conventional) HSI color space, of an inverse-conical
form of modified HSI color space, of a double-conical
modified HSI color space, and of a modified
cylindrical HSI color space, respectively;

Fig. 27 is a partial flow diagram showing details 
of a first part of processing which is executed to 
derive edge strength and possible edge direction
information for each of the pixels of a color image in
succession, with the fourth embodiment of the 
invention; 

Fig. 28 is a diagram illustrating a modified HSI 
color space, of double-conical form, utilized with a
fifth embodiment of the invention;

Fig. 29 is a diagram showing an edge image which
is obtained as a result of applying edge detection by 
the apparatus of the fifth embodiment to the simplified 
color image of Fig. 21; 

Fig. 30 is a partial flow diagram showing details 
of a first part of processing which is executed to 
derive edge strength and possible edge direction 
information for each of the pixels of a color image in 
succession, with the fifth embodiment of the invention; 

Fig. 31 is a graph of a saturation value
modification function which is utilized to transform 
color values into a modified cylindrical form of HSI 
color space, with a sixth embodiment of the invention; 

Fig. 32 is a diagram illustrating the modified 
cylindrical HSI color space that is utilized with the 
sixth embodiment; 

Fig. 33 is a diagram showing an edge image which
is obtained as a result of applying edge detection by
the apparatus of the sixth embodiment to the simplified
color image of Fig. 21;

Fig. 34 is a partial flow diagram showing details 
of a first part of processing which is executed to 
derive edge strength and possible edge direction
information for each of the pixels of a color image in
succession, with the sixth embodiment of the invention; 

Fig. 35 is a general system block diagram of a 
seventh embodiment of an image recognition apparatus 
according to the present invention; 

Fig. 36 is a conceptual diagram for illustrating
the principles of a region image; 

Fig. 37 is a basic flow diagram of the operation 
of the seventh embodiment; 

Fig. 38 is a diagram for use in describing a
process of eliminating specific small regions from a 
region image, performed by the seventh embodiment; 

Fig. 39 is a diagram for use in describing a
process of eliminating specific small regions from a
region image, performed by an eighth embodiment of the 
invention; 

Fig. 40 is a basic flow diagram of the operation 
of the eighth embodiment; 

Fig. 41 is a diagram for use in describing a 
process of eliminating specific small regions from a 
region image, performed by a ninth embodiment of the 
invention; 

Fig. 42 is a basic flow diagram of the operation
of the ninth embodiment;

Fig. 43 is a general system block diagram of a 
tenth embodiment of an image recognition apparatus 
according to the present invention; 

Fig. 44 is a basic flow diagram of the operation
of the tenth embodiment;

Fig. 45 is a diagram for use in describing how 
specific small regions are eliminated from a color 
image, by the apparatus of the tenth embodiment; 

Fig. 46 is a diagram for illustrating the effect
of the processing of the tenth embodiment in
eliminating specific small regions from an edge image
which has been derived by edge detection processing of
an actual photograph; and

Fig. 47 shows a set of edge images which have been
derived by a prior art type of image recognition
apparatus, with hue, saturation and intensity edge
images respectively obtained.

DESCRIPTION OF PREFERRED EMBODIMENTS 
Embodiments of the present invention will be
described in the following, referring to the drawings.
It should be noted that the invention is not limited in
its scope to these embodiments, and that various other
forms of these could be envisaged.

A first embodiment of an image recognition
apparatus according to the present invention will be
described referring to Fig. 1. As used herein in
referring to embodiments of the invention, the term
"image recognition" is used in the limited sense of
signifying "processing the data of an original color
image to derive shape data, i.e., data of an edge image
which expresses only the outlines of objects appearing
in the original color image". The apparatus is formed
of a color image data storage section 1 which stores
the data of a color image that is to be subjected to
image recognition processing, an image recognition
processing section 2 which performs the image
recognition processing of the color image data, and a
shape data storage section 3 which stores shape data
expressing an edge image, which have been derived by
the image recognition processing section 2.

The image recognition processing section 2 is made
up of a color vector data generating section 21, an
edge template application section 22, an edge strength
and direction determining section 23 and an edge pixel
determining section 24. The color vector data
generating section 21 generates respective color
vectors for each of the pixels of the color image, with
each color vector expressed as a plurality of scalar
values which express the color attributes of the
corresponding pixel and which are coordinates of an
orthogonal color space having more than two dimensions.
The edge template application section 22 processes the
pixel vector data by utilizing edge templates as
described hereinafter, to generate edge vector data.
Specifically, using four different edge templates with
this embodiment which respectively correspond to four
different orientation directions within the color
image, a corresponding set of four edge vectors is
derived for each of the pixels of the color image. The
edge strength and direction determining section 23
operates on each of the pixels of the color image in
succession, to determine whether the pixel may be
situated on an edge, and if so, determines the
direction of orientation of that possible edge and its
edge strength. The edge pixel determining section 24
operates on the information thus derived by the edge
strength and direction determining section 23, to
determine those pixels which are actually judged to be
edge pixels, and to thereby generate the shape data,
i.e., data which express an edge image in which only
the outlines of objects in the original color image are
represented.

As shown in the left side of Fig. 2, the image
data stored in the color image data storage section 1
are assumed to be represented by respective (x,y)
coordinates of points in a 2-dimensional plane, i.e.,
each pair of values (x,y) corresponds to one specific
pixel. It is also assumed that the color attributes of
each pixel are expressed as a position in an RGB color
space by three scalar values which are coordinates of
that space, i.e., as a set of r (red), g (green) and b
(blue) values, as illustrated on the left side of Fig.
2. The function of the color vector data generating
section 21 is to express the color attributes of each
pixel of the color image as a plurality of scalar
values which are coordinates of a vector in an
orthogonal color space. If such a set of scalar values
for each pixel is directly provided from the stored
data of the color image data storage section 1, it will
be unnecessary for the color vector data generating
section 21 to perform any actual processing. However,
if for example the data of the color image were stored
in the color image data storage section 1 in some other
form, e.g., with the color attributes of each pixel
expressed as a set of polar coordinates, or with
respective index values being stored for the pixels,
corresponding to respective sets of r, g, b values
within an RGB table memory, then the color vector data
generating section 21 would perform all processing
necessary to convert the data for each pixel to a
plurality of scalar values that are coordinates of an
RGB orthogonal color space.

Moreover if desired, it would be possible for the
color vector data generating section 21 to be
controlled to modify the relationships between the
magnitudes of the r, g, b values of each pixel, to
thereby modify the relative contributions of these to
the magnitude of the modulus of a corresponding color
vector.

It will be assumed that each of the r, g and b
scalar values is formed of 8 bits, so that each value
can be in the range 0 to 255. Fig. 3 illustrates the
RGB color space of these coordinates.
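
By way of illustration only, the conversion performed by the color vector data generating section 21 for index-type image data, together with the optional modification of the relative contributions of the r, g and b values, could be sketched as below. The function name make_color_vectors, the palette argument (standing in for the RGB table memory) and the weights argument are hypothetical and are introduced solely for this sketch; the specification does not state how the relationships between the r, g, b magnitudes would be modified, and simple per-channel scaling is assumed here.

```python
def make_color_vectors(index_image, palette, weights=(1.0, 1.0, 1.0)):
    """Convert a palette-indexed image into per-pixel (r, g, b) color vectors.

    index_image : list of rows of integer index values
    palette     : list of (r, g, b) triples, each value in the range 0 to 255
    weights     : assumed per-channel scale factors (not from the source)
    """
    wr, wg, wb = weights
    vectors = []
    for row in index_image:
        out_row = []
        for index in row:
            r, g, b = palette[index]  # look up the stored r, g, b values
            out_row.append((wr * r, wg * g, wb * b))
        vectors.append(out_row)
    return vectors
```

With the default weights the output is simply the looked-up r, g, b triple for each pixel, which corresponds to the case in which no modification is applied.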

The data of a color image such as that shown in
the upper part of Fig. 4 will be assumed to be stored
in the color image data storage section 1, i.e., an
image in which the objects are a street 40 and a
building 41, in a ground area 42. The image
recognition processing section 2 applies edge detection
to this image, to thereby obtain an edge image as shown
in the lower part of Fig. 4, which is stored in the
shape data storage section 3. The edge image is a
bi-level image, i.e., the black lines in the lower part of
Fig. 4 correspond to pixels which are situated along
the edges of objects which appear in the original color
image, while the white portions correspond to pixels
which do not correspond to edges. Basically, the edge
detection that is executed by the image recognition
processing section 2 serves to detect the change
between the color of the street 40 and the color of
adjacent areas, and between the color of the building
41 and the color of adjacent areas, and to judge that
each position where the amount of such change is large
corresponds to the position of an edge. The shapes of
the street and building are thereby detected as the
shapes 50, 51 respectively, shown in the lower part of
Fig. 4.

Fig. 5 is a flow diagram showing the basic
features of the operation of the first embodiment,
which is executed as follows.

Step 10: Respective color
vectors are derived for each of the pixels of the color
image, with each color vector expressed as a
combination of scalar values, which in this instance
are constituted by the aforementioned r, g and b values
of the pixel. The color vector of a pixel at position
(x, y) of the color image, having the RGB scalar values
r(x, y), g(x, y), b(x, y), is expressed by equation (1)
below.

$$\mathrm{PV}(x, y) = \begin{pmatrix} r(x, y) \\ g(x, y) \\ b(x, y) \end{pmatrix} \tag{1}$$



Step 11: Local multiplication and summing
operations are performed using the four edge templates
h1, h2, h3, h4, to thereby generate edge vector data
EV1, EV2, EV3, EV4 for each of the pixels of the color
image. Figs. 6A, 6B, 6C, 6D respectively show the four
edge templates designated as h1, h2, h3, h4 which are
utilized with this embodiment. In Fig. 6A, h1 is an
edge template corresponding to an edge that is oriented
in the left-right direction of the color image, and
returns a large value when this template is applied to
an image position where there is an edge that extends
along the left-right direction. Similarly in Fig. 6B,
h2 is an edge template for the lower left - upper right
diagonal direction, in Fig. 6C h3 is an edge template
for the top - bottom direction, and in Fig. 6D h4 is an
edge template for the lower right - upper left diagonal
direction. As shown, each edge template basically
consists of an array of numeric values which are
divided into two non-zero sets of values, of mutually
opposite sign, which are located symmetrically with
respect to a line of zero values that is oriented in
the edge direction corresponding to that edge template.
The values 0, 1, 2, -2 and -1 of the edge template h1
can be expressed as shown in equations (2) below.
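$$\begin{aligned}
\mathrm{h1}(-1,-1) &= 1, & \mathrm{h1}(0,-1) &= 2, & \mathrm{h1}(1,-1) &= 1 \\
\mathrm{h1}(-1,0) &= 0, & \mathrm{h1}(0,0) &= 0, & \mathrm{h1}(1,0) &= 0 \\
\mathrm{h1}(-1,1) &= -1, & \mathrm{h1}(0,1) &= -2, & \mathrm{h1}(1,1) &= -1
\end{aligned} \tag{2}$$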

The multiplication and summing processing that is
applied between the four both-direction edge templates
and PV(x, y) is expressed by equations (3) below.

$$\begin{aligned}
\mathrm{EV1}(x, y) &= \sum_{k=-1}^{1} \sum_{l=-1}^{1} \mathrm{h1}(k, l)\, \mathrm{PV}(x + k, y + l) \\
\mathrm{EV2}(x, y) &= \sum_{k=-1}^{1} \sum_{l=-1}^{1} \mathrm{h2}(k, l)\, \mathrm{PV}(x + k, y + l) \\
\mathrm{EV3}(x, y) &= \sum_{k=-1}^{1} \sum_{l=-1}^{1} \mathrm{h3}(k, l)\, \mathrm{PV}(x + k, y + l) \\
\mathrm{EV4}(x, y) &= \sum_{k=-1}^{1} \sum_{l=-1}^{1} \mathrm{h4}(k, l)\, \mathrm{PV}(x + k, y + l)
\end{aligned} \tag{3}$$

The above signifies that, designating the image
position of the pixel that is currently being processed
(i.e., the object pixel) as (x, y), a first edge vector
EV1(x, y) is obtained by multiplying the color vector
PV(x-1, y-1) of the pixel which is located at the image
position (x-1, y-1) by the scalar value that is
specified for the (-1, -1) position in the edge
template h1, i.e. by the value 1, multiplying the color
vector PV(x, y-1) of the pixel which is located at the
image position (x, y-1) by the scalar value that is
specified for the (0, -1) position in the edge template
h1, i.e. by the value 2, and so on. In that way, the
edge template h1 is applied to the color vector of the
object pixel and to the respective color vectors of
eight pixels which are located immediately adjacent to
the object pixel in the color image. A set of nine
vectors is thereby obtained, and the vector sum of
these is then calculated, to obtain a first edge vector
EV1(x,y).

The above array multiplication and vector summing
process is applied using the other three edge templates
h2, h3, h4 in the same manner, to the object pixel and
its adjacent pixels, to obtain the edge vectors
EV2(x,y), EV3(x,y) and EV4(x,y) respectively
corresponding to these other three edge templates. The
above process is executed for each of the pixels of the
color image in succession, as the object pixel.
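
The multiplication and vector summing described above could be sketched as follows for a single object pixel. Only the values of template h1 are stated in the text; the arrays H2, H3 and H4 below are assumed values, chosen merely so that each has a line of zero values along its edge direction with sets of opposite sign on either side, and the actual templates of Figs. 6B to 6D may differ. The names edge_vector and edge_vectors are hypothetical, and the sketch assumes an object pixel that is not on the image border, since border handling is not addressed here.

```python
# Templates are indexed as TEMPLATE[l + 1][k + 1], with k the x offset and
# l the y offset, each running from -1 to 1.
H1 = ((1, 2, 1), (0, 0, 0), (-1, -2, -1))    # left-right (values from the text)
H2 = ((2, 1, 0), (1, 0, -1), (0, -1, -2))    # lower left - upper right (assumed)
H3 = ((1, 0, -1), (2, 0, -2), (1, 0, -1))    # top - bottom (assumed)
H4 = ((0, 1, 2), (-1, 0, 1), (-2, -1, 0))    # lower right - upper left (assumed)
TEMPLATES = (H1, H2, H3, H4)


def edge_vector(image, x, y, template):
    """Weighted vector sum of the 3x3 block of color vectors centered on (x, y).

    image is a list of rows of (r, g, b) tuples, addressed as image[y][x].
    """
    ev = [0, 0, 0]
    for l in (-1, 0, 1):
        for k in (-1, 0, 1):
            weight = template[l + 1][k + 1]
            pv = image[y + l][x + k]
            for c in range(3):
                ev[c] += weight * pv[c]
    return tuple(ev)


def edge_vectors(image, x, y):
    """Return (EV1, EV2, EV3, EV4) for an interior object pixel (x, y)."""
    return tuple(edge_vector(image, x, y, h) for h in TEMPLATES)
```

Because each edge vector keeps its r, g and b components separate until the modulus is taken, a change in any one color component contributes to the edge strength.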

Fig. 7 shows the four edge vectors that are
obtained as a result of applying the four edge
templates of Figs. 6A to 6D to the color vector (r, g, b values
72, 183, 207 respectively) of the center pixel in the
diagram at the right side of Fig. 2. EV1 is the edge
vector corresponding to the left - right direction, EV2
corresponds to the lower left - upper right diagonal
direction, EV3 corresponds to the bottom - top
direction, and EV4 corresponds to the lower right -
upper left diagonal direction.

Step 12: Using these edge vectors EV1, EV2, EV3,
EV4, the strength and orientation of an edge on which
the object pixel may be located are determined. That
edge will be referred to in the following as the
"presumptive edge" obtained for the object pixel, which
may or may not be subsequently confirmed to be an
actual edge as described hereinafter. The strength of
the presumptive edge obtained for the object pixel
having the image position (x, y), which is obtained as
the value of the largest of the four moduli of the edge
vectors EV1, EV2, EV3, EV4, will be designated as
"MOD(x,y)", and the direction of that presumptive edge
will be designated as "DIR(x,y)". That is to say,
applying processing in accordance with equation (4)
below, respective values of strength of the presumptive
edge, MOD(x,y), are obtained for each of the pixels of
the color image in succession, and the strength values
are stored temporarily.

$$\mathrm{mod}(x, y) = \max\bigl(|\mathrm{EV1}(x, y)|,\ |\mathrm{EV2}(x, y)|,\ |\mathrm{EV3}(x, y)|,\ |\mathrm{EV4}(x, y)|\bigr) \tag{4}$$

If it is found when attempting to apply equation
(4) that none of the moduli of the edge vectors
obtained for a pixel exceeds all of the other edge
vector moduli obtained for that pixel, then this may
result from all of the moduli of the edge vectors
EV1(x,y), EV2(x,y), EV3(x,y), EV4(x,y) corresponding to
the respective edge templates h1 to h4 being of equal
magnitude. In that case no possible edge direction is
obtained for the object pixel; however, the modulus
value of the edge vectors is stored as the edge
strength value MOD obtained for that pixel, for use in
subsequent processing.

Next, successively selecting each of the pixels
(i.e., those pixels for which a presumptive edge has
been obtained) as the object pixel and applying
processing in accordance with equation (5) below, the
orientation of the presumptive edge, designated in the
following as DIR(x,y), is obtained for each of the
pixels. That orientation is the direction
corresponding to the edge template whose application
resulted in generation of the edge strength value
MOD(x,y) for that pixel. Information specifying the
obtained edge directions for the respective pixels is
temporarily stored.

$$\mathrm{dir}(x, y) = \begin{cases}
\text{``left-right''} & \text{if } \mathrm{mod}(x, y) = |\mathrm{EV1}(x, y)| \\
\text{``lower left-top right''} & \text{if } \mathrm{mod}(x, y) = |\mathrm{EV2}(x, y)| \\
\text{``bottom-top''} & \text{if } \mathrm{mod}(x, y) = |\mathrm{EV3}(x, y)| \\
\text{``lower right-top left''} & \text{if } \mathrm{mod}(x, y) = |\mathrm{EV4}(x, y)|
\end{cases} \tag{5}$$

For example, comparing the magnitudes of the
respective moduli of the edge vectors shown in Fig. 7,
the magnitude for EV3 is 437, which is larger than the
magnitudes of each of the other edge vector moduli, so
that as shown in Fig. 8, the strength MOD of the
presumptive edge of that pixel is obtained as 437.
Also, since that edge strength value corresponds to the
edge template h3 shown in Fig. 6C, the edge direction
of that presumptive edge is determined as being the
bottom - top direction of the color image.

Step 13: The edge image data "EDGE" are generated,
using a predetermined edge strength threshold value t,
the respective presumptive edge strength values "MOD"
obtained for the pixels of the color image and the
respective edge directions "DIR" obtained for the
pixels, in the manner indicated by equation (6) below.

$$\mathrm{edge}(x, y) = \text{``edge''} \quad \text{if } \mathrm{mod}(x, y) \ge t \text{ and }
\begin{cases}
\mathrm{dir}(x, y) = \text{``left-right''} \ \wedge\ \mathrm{mod}(x, y) > \mathrm{mod}(x, y-1) \ \wedge\ \mathrm{mod}(x, y) > \mathrm{mod}(x, y+1), & \text{or} \\
\mathrm{dir}(x, y) = \text{``lower left-top right''} \ \wedge\ \mathrm{mod}(x, y) > \mathrm{mod}(x-1, y-1) \ \wedge\ \mathrm{mod}(x, y) > \mathrm{mod}(x+1, y+1), & \text{or} \\
\mathrm{dir}(x, y) = \text{``bottom-top''} \ \wedge\ \mathrm{mod}(x, y) > \mathrm{mod}(x-1, y) \ \wedge\ \mathrm{mod}(x, y) > \mathrm{mod}(x+1, y), & \text{or} \\
\mathrm{dir}(x, y) = \text{``lower right-top left''} \ \wedge\ \mathrm{mod}(x, y) > \mathrm{mod}(x+1, y-1) \ \wedge\ \mathrm{mod}(x, y) > \mathrm{mod}(x-1, y+1).
\end{cases} \tag{6}$$

Otherwise, edge(x,y) is not "edge".

That is to say, the pixels for which respective
presumptive edges (i.e., possible edge directions) have
been derived are successively selected as the object
pixel, with the threshold value t, edge strength
MOD(x,y) and edge direction DIR(x,y) of the object
pixel being used to make a decision as to whether or
not the object pixel actually is an edge pixel. With
equation (6), if a pixel has an edge strength that is
equal to or higher than t, and the relationship between that pixel
and the adjacent pixels satisfies one of the four
patterns which are shown in Figs. 9A to 9D, then it is
judged that this is an edge pixel.
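
The decision described above could be sketched as follows. The name is_edge_pixel, the direction strings used as dictionary keys, and the treatment of neighbors lying outside the image (which are simply skipped) are assumptions made for this illustration. The neighbor offsets for the two diagonal directions are chosen so that the compared pixels lie on opposing sides of the presumptive edge, with y taken as increasing downward.

```python
# For each edge direction, the (dx, dy) offsets of the two pixels that lie
# on opposing sides of a presumptive edge oriented in that direction.
NEIGHBOUR_OFFSETS = {
    "left-right": ((0, -1), (0, 1)),
    "lower left-upper right": ((-1, -1), (1, 1)),
    "bottom-top": ((-1, 0), (1, 0)),
    "lower right-upper left": ((1, -1), (-1, 1)),
}


def is_edge_pixel(mod, direction, x, y, t):
    """Judge whether pixel (x, y) is an edge pixel.

    mod       : list of rows of edge strength values, addressed as mod[y][x]
    direction : list of rows of direction strings, or None where no
                direction was obtained for the pixel
    t         : edge strength threshold value
    """
    d = direction[y][x]
    if d is None or mod[y][x] < t:
        return False
    for dx, dy in NEIGHBOUR_OFFSETS[d]:
        nx, ny = x + dx, y + dy
        # Neighbors outside the image are skipped (assumption).
        if 0 <= ny < len(mod) and 0 <= nx < len(mod[0]):
            if mod[y][x] <= mod[ny][nx]:
                return False
    return True
```

Requiring the object pixel to be strictly stronger than both neighbors keeps only the crest of a band of high edge strength, so that the resulting edge lines are thin.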

More specifically, numeral 200 in Fig. 9A
designates an array of six pixels of the color image,
centered on a pixel 202 which is currently being
processed as the object pixel. The designations
"weak", "strong" indicate the relationships between the
respective values of strength that have been previously
obtained for the pixels as described above. In Fig. 9A,
it is assumed that the edge strength MOD obtained for
pixel 202 is the edge vector modulus that is obtained
by using the edge template h1 shown in Fig. 6A, i.e.,
the modulus of EV1(x,y) in equations (3) described above, and hence the
orientation DIR of the presumptive edge corresponding
to pixel 202 is the left-right direction of the color
image, i.e. a presumptive edge has been derived for
pixel 202 as a straight line of undefined length which
passes through that pixel and is oriented in the
horizontal direction of Fig. 9A. It is further assumed
in Fig. 9A that the respective values of edge strength
derived for the two pixels 201, 203 which are
immediately adjacent to the object pixel 202 and
disposed on opposing sides of the presumptive edge
derived for the object pixel 202 are both less than the
value of strength that has been derived for the
presumptive edge of the object pixel 202. In that
condition, if that edge strength value obtained for the
object pixel 202 exceeds the edge threshold value t,
then it is judged that pixel 202 is located on an
actual edge within the color image.

Similarly in Fig. 9B, the presumptive edge that
has been derived for the object pixel 205 is a line
extending through the pixel 205, oriented in the lower
left-upper right diagonal direction of the color image,
and the respective values of strength derived for the
two pixels 204, 206 which are immediately adjacent to
the object pixel 205 and disposed on opposing sides of
the presumptive edge derived for the object pixel 205
are both less than the value of strength that has been
derived for the presumptive edge of the object pixel
205. Thus in the same way as for the example of Fig.
9A, assuming that the edge strength value obtained for
the object pixel 205 exceeds the edge threshold value
t, it will be judged that pixel 205 is located on an
actual edge, oriented diagonally as shown in Fig. 9B
within the color image. In a similar way, it will be
judged that the object pixel is located on an actual
vertically oriented edge if the pattern condition of
Fig. 9C is satisfied, or on an actual edge which is
oriented along the lower right-upper left diagonal
direction, if the pattern condition of Fig. 9D is
satisfied.

As can be understood from the above description,
the effect of applying one of the edge templates shown
in Figs. 6A to 6D to an array of color vectors centered
on an object pixel is to obtain (as an edge vector) the
vector difference between the weighted vector sum of
the color vectors of a first set of pixels which are
located on one side of the object pixel with respect to
the edge direction of that template (i.e., whose
vectors are multiplied by 1, 2, and 1 respectively) and
the weighted vector sum of the color vectors of a
second set of pixels which are located on the opposite
side of the object pixel (i.e., whose vectors are
multiplied by -1, -2 and -1, respectively). It will be
further understood that the invention is not limited to
the configurations of edge templates utilized with this
embodiment.
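
As a worked instance of the vector difference described above, substituting the values of the edge template h1 given in equations (2) into the first of equations (3) gives the expression below. This is only a restatement of those equations and introduces nothing new: the first bracket is the weighted vector sum for the three pixels at y-1, the second bracket is that for the three pixels at y+1, and the pixels at y, which coincide with the line of zero values of h1, make no contribution.

```latex
\mathrm{EV1}(x, y) =
  \bigl[\mathrm{PV}(x-1, y-1) + 2\,\mathrm{PV}(x, y-1) + \mathrm{PV}(x+1, y-1)\bigr]
  - \bigl[\mathrm{PV}(x-1, y+1) + 2\,\mathrm{PV}(x, y+1) + \mathrm{PV}(x+1, y+1)\bigr]
```

If the color vectors on the two sides are identical, the two brackets cancel and the edge vector is the zero vector, so that a region of uniform color yields an edge strength of zero.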

Fig. 11 is a flow diagram showing details of the
processing performed in steps 10 to 12 of Fig. 5, to
derive the edge vectors and the edge strength "mod" and
edge direction "dir" information for the pixels of the
color image that is to be processed. The sequence of
steps 1001 to 1010 of Fig. 11 is repetitively executed
for each of the pixels of the color image in
succession, i.e., with the pixels being successively
selected as the object pixel for which mod and dir
information are to be derived. In step 1002, a
plurality of scalar values expressing a color vector in
the orthogonal RGB color space for the object pixel
are read out from the color image data storage section
1 (i.e., the r, g and b values for the object pixel), as
are also the respective sets of RGB values expressing
the color vectors of the group of eight pixels which
are immediately adjacent to the object pixel and
surround the object pixel. In step 1003 that array of
nine color vectors is successively multiplied by each
of the arrays of values which constitute the edge
templates h1, h2, h3 and h4, in the manner described
hereinabove, with the respective vector sums of the
results being obtained as the edge vectors EV1, EV2,
EV3 and EV4. In step 1004, the moduli of these edge
vectors are obtained and are compared, to find if one
of these is greater than each of the other three. If
this condition is met, as determined in step 1006, then
that largest value of modulus is temporarily stored in
an internal memory (not shown in the drawings) as the
edge strength MOD(x,y) of the object pixel, together
with information indicating the direction corresponding
to that largest edge vector as the orientation DIR(x,y)
of the object pixel.

However if the condition whereby one of the moduli of
EV1, EV2, EV3 and EV4 is greater than each of the other
three is not satisfied then step 1005 is executed to
judge whether all of the vector moduli have the same
value. If that condition is found, then no direction
can be obtained as DIR(x,y) for the object pixel, and
only that modulus value is stored as the edge strength
MOD(x,y) for the object pixel, in step 1007. If that
condition is not found (i.e., two or three of the
vector moduli have the same value, which is greater
than that of the remaining one(s)) then the modulus of
an arbitrarily selected one of the edge vectors which
have the largest value is selected as the edge strength
MOD(x,y) of the object pixel, while the orientation of
the edge template corresponding to that selected edge
vector is stored as the edge direction DIR(x,y) of the
object pixel, in step 1008.
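
The three cases handled in steps 1004 to 1008 could be sketched as follows. The function name strength_and_direction and the direction strings are hypothetical. The modulus is assumed here to be the Euclidean length of the edge vector, and the "arbitrarily selected" vector in the case of a partial tie is taken to be the first one in template order, which is only one possible choice. The comparison is made on squared moduli so that, for integer color values, equal moduli are detected exactly.

```python
import math

DIRECTIONS = (
    "left-right",               # template h1
    "lower left-upper right",   # template h2
    "bottom-top",               # template h3
    "lower right-upper left",   # template h4
)


def strength_and_direction(edge_vectors):
    """Return (MOD, DIR) for one pixel from its four edge vectors.

    DIR is None when all four moduli are equal, i.e. when no edge
    direction can be obtained for the pixel.
    """
    squared = [sum(c * c for c in ev) for ev in edge_vectors]
    largest = max(squared)
    winners = [i for i, s in enumerate(squared) if s == largest]
    mod = math.sqrt(largest)
    if len(winners) == len(squared):
        return mod, None                  # all moduli equal (step 1007)
    # A unique maximum (step 1006), or a partial tie resolved by taking
    # the first of the largest vectors (step 1008, arbitrary selection).
    return mod, DIRECTIONS[winners[0]]
```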

Fig. 12 is a flow diagram showing details of the
processing performed in step 13 of Fig. 5, to derive
the shape data which are to be output and stored in the
shape data storage section 3, i.e., to find
each of the pixels which is actually located on an edge
within the color image, and the corresponding edge
direction. The sequence of steps 1011 to 1017 of Fig.
12 is successively applied to each of the pixels of the
color image for which edge direction information DIR
has been derived and temporarily stored, together with
corresponding edge strength information MOD, as
described above. In steps 1011, 1012 the next pixel to
which this processing is to be applied as the object
pixel is selected, and the edge strength MOD(x,y) and
edge direction DIR(x,y) information for that object
pixel are read out. If it is judged in step 1013 that
the value of MOD(x,y) is greater than or equal to the
edge threshold value t, then step 1014 is executed, to
read out the respective values of edge strength of the
two pixels which are located immediately adjacent to
the object pixel and on mutually opposite sides of the
presumptive edge that has been detected for the object
pixel.

Next, in step 1015, the three values of edge
strength are compared, to determine if the edge
strength MOD(x,y) of the object pixel is greater than
the edge strengths of both these adjacent pixels. If
so, then the pixel which corresponds in position to the
object pixel within the image expressed by the shape
data (i.e., the edge image) is specified as being
located on an actual edge, which is oriented in the
direction DIR(x,y). In that way, the shape data
expressing the edge image are successively derived as
binary values which indicate, for each pixel of the
color image, whether or not that pixel is located on an
edge.

It can thus be understood that with the above
processing, a pixel of the color image, when processed
as the object pixel, will be judged to be located on an
actual edge within the color image if it satisfies the
conditions:

(a) an edge direction DIR, and also a value of
edge strength MOD that exceeds the edge threshold value
t, have been obtained for that object pixel, and

(b) the edge strength MOD of that object pixel is
greater than each of the respective edge strengths of
the two pixels which are located immediately adjacent
to the object pixel and are on mutually opposite sides
of a presumptive edge (i.e., a line which is oriented
in direction DIR, passing through that pixel) that has
been obtained for the object pixel.
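
The judgement of conditions (a) and (b) can be sketched
in Python as follows. This is only an illustrative
sketch under stated assumptions, not the implementation
of the embodiment: the direction codes 0 to 3, the table
NORMAL_OFFSETS, the function name thin_edges, and the
treatment of positions outside the image as having zero
edge strength are all hypothetical. The arrays mod and
dirs stand for the stored MOD and DIR values, and t for
the edge threshold value.

```python
# Offsets (dx, dy) to the two pixels on opposite sides of an edge,
# i.e. along the normal to the edge direction. The direction codes are
# assumptions made for this sketch (image y axis pointing downward):
# 0 = horizontal edge, 1 = "/" diagonal, 2 = vertical edge, 3 = "\" diagonal.
NORMAL_OFFSETS = {
    0: ((0, -1), (0, 1)),
    1: ((-1, -1), (1, 1)),
    2: ((-1, 0), (1, 0)),
    3: ((1, -1), (-1, 1)),
}


def thin_edges(mod, dirs, t):
    """Return a binary edge image from edge strengths and directions."""
    height, width = len(mod), len(mod[0])
    edge = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            d = dirs[y][x]
            # Step 1013: skip pixels with no direction or below threshold.
            if d is None or mod[y][x] < t:
                continue
            # Step 1014: read the strengths of the two adjacent pixels.
            neighbours = []
            for dx, dy in NORMAL_OFFSETS[d]:
                nx, ny = x + dx, y + dy
                inside = 0 <= nx < width and 0 <= ny < height
                # Assumption: positions outside the image count as 0.
                neighbours.append(mod[ny][nx] if inside else 0)
            # Step 1015: keep the pixel only if it exceeds both neighbours.
            if mod[y][x] > neighbours[0] and mod[y][x] > neighbours[1]:
                edge[y][x] = 1
    return edge
```

For a vertical ridge of edge strength that is several
pixels wide, only the column holding the largest
strength is retained, so that the edge is thinned to a
width of one pixel.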

With the operation of Fig. 11, Fig. 12 described
above, in the event that it is found in step 1005 that
there are a plurality of edge vectors having the same
magnitude of modulus, which is greater than that of the
remaining vector(s), for example if the moduli of EV1,
EV2 are identical and each is larger than the
respective moduli of EV3, EV4, then the edge direction
corresponding to an arbitrarily selected one of the
largest edge vectors is selected to be used as the edge
direction DIR of the object pixel, in step 1008.
However various other procedures could be used when
such a condition occurs. An alternative procedure is
illustrated in the flow diagrams of Figs. 13, 14. In
step 1008b of Fig. 13, the respective edge template
directions corresponding to each of the edge vectors
having the largest moduli are all stored as candidates
for the edge direction DIR of the object pixel,
together with the maximum edge vector modulus value as
the edge strength MOD. In that case, as shown in Fig.
14, if the pixel which has been selected as the object
pixel in step 1011 is found to have a plurality of
corresponding candidate edge directions DIR stored,
then the information specifying these different
directions is successively read out in repetitions of
a step 1012b. That is to say, the processing of steps
1012b to 1015 is repetitively executed for each of
these directions until either it is found that the
condition of step 1015 is satisfied (the pixel is
judged to be on an actual edge) or all of the candidate
edge directions for that pixel have been tried, as
judged in step 1018. In other respects, the
processing shown is identical to that of Figs. 11, 12
described above.

A specific example will be described in the
following. The upper part of Fig. 10 shows data of a
color image, expressed as coordinates of an RGB color
space, representing a simplified aerial photograph
which is to be subjected to image recognition. The
image is identical to that of Fig. 4, containing a
street, ground, and a building, with the building roof
and first and second side faces of the building
appearing in the image. Respective RGB values for each
of these are assumed to be as indicated in the drawing.
For example it is assumed that each of the pixels
representing the ground surface has the r, g and b
values 195, 95 and 0 respectively. By applying the
first embodiment of the invention to this image to
process the data of the color image in the manner
described above, bi-level shape data are obtained from
the image recognition processing section 2 and stored
in the shape data storage section 3, with the shape
data expressing the outlines of the street and the
building roof and side faces in the form of edges, as
shown in the lower part of Fig. 10, i.e., with the
shape of the street formed as two edges 50, and the
shape of the building roof and side faces being formed
as the set of edges 51.

As described above, with the present invention,
pixel vector data are generated as combinations of
pluralities of scalar values constituting pixel values,
and edge detection is performed by operating on these
pluralities of scalar values. With prior art types of
edge detection which operate only upon values of
intensity, if the outlines of a body exist within an
image but are not in the form of variations in
intensity, then edge detection cannot be achieved for
that body. However with the present invention, in such
a condition, edge detection becomes possible.

Furthermore, by applying edge templates to pixel 
vector data, edge directions can be obtained easily and 
reliably. If the direction of an edge is known, then 
it becomes possible to form that edge as a continuous 
line (as expressed in the shape image that is 
generated) even if not all of the pixels corresponding
to that edge are detected. That is to say, if the
direction of an edge can be reliably obtained on the 
basis of a part of the pixels of that edge, then 
interpolation of the remaining pixels can readily be 
performed, to thereby eliminate any breaks in the 
continuity of the edge. For that reason, the basic 
feature of the present invention whereby it is possible 
not only to detect the strengths of edges, but also to 
reliably estimate their directions, is highly 
important. 

A second embodiment of an image recognition
apparatus according to the present invention is shown
in the general system block diagram of Fig. 15. Here,
sections having similar functions to those of the
apparatus of the first embodiment shown in Fig. 1 are
designated by identical reference numerals to those of
Fig. 1. In the apparatus of Fig. 15, the color vector
data generating section 121 performs a similar function
to that of the color vector data generating section 21
of the first embodiment, but in addition receives
control parameter adjustment data, supplied from an
external source as described hereinafter. In addition,
the apparatus of Fig. 15 further includes a color space
coordinates conversion section 25 for performing
color space coordinate transform processing. The data
stored in the image data storage section 1, which in
the same way as described for the first embodiment will
be assumed to directly represent a color image as sets
of r, g, b values that are coordinates of an RGB color
space, are transformed to coordinates of a different
orthogonal color space, specifically, a color space in
which chrominance and intensity values are mutually
separated. Color vectors are then generated for each
of the pixels by the color vector data generating
section 121 using the results of the transform
operation.

Fig. 16 is a flow diagram showing the basic
features of the operation of the second embodiment.

Steps 11, 12 and 13 of this flow diagram are
identical to those of the basic flow diagram of the
first embodiment shown in Fig. 5. Step 10 of this flow
diagram differs from that of the first embodiment in
that color vector modulus adjustment can be performed,
as described hereinafter. A new step 20 is executed as
follows.

Step 20: the color attribute data of each pixel
are transformed from the RGB color space to coordinates
of the color space shown in Fig. 17. Specifically,
each set of pixel values r(x, y), g(x, y), b(x, y) is
operated on, using equation (7), to obtain a
corresponding set of coordinates c1(x, y), c2(x, y),
c3(x, y). Here, c1 expresses a form of intensity value
for the pixel, i.e., as the average of the r, g and b
values of the pixel, c2 expresses the proportion of
the red component of that pixel in relation to the
total of the red, green and blue values for that pixel,
and c3 similarly expresses the proportion of the green
component of that pixel in relation to the total of the
red, green and blue values of that pixel.

\[
\begin{aligned}
c_1(x, y) &= \frac{r(x, y) + g(x, y) + b(x, y)}{3} \\
c_2(x, y) &= \frac{r(x, y)}{r(x, y) + g(x, y) + b(x, y)} \cdot max\_value \\
c_3(x, y) &= \frac{g(x, y)}{r(x, y) + g(x, y) + b(x, y)} \cdot max\_value
\end{aligned}
\tag{7}
\]
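
The transform of equation (7) can be sketched in Python
as follows. This is a minimal sketch under stated
assumptions: the function name rgb_to_c is hypothetical,
max_value is taken as 255 (8-bit data), and the result
returned for a pixel whose r, g and b values are all
zero is an assumption, since equation (7) does not
define that case.

```python
def rgb_to_c(r, g, b, max_value=255):
    """Convert r, g, b values to the c1, c2, c3 coordinates of equation (7)."""
    total = r + g + b
    if total == 0:
        # Assumption: the proportions are undefined for a black pixel,
        # so all three coordinates are simply set to zero here.
        return (0.0, 0.0, 0.0)
    c1 = total / 3                 # average of r, g and b (intensity)
    c2 = r / total * max_value     # proportion of red
    c3 = g / total * max_value     # proportion of green
    return (c1, c2, c3)


red_point = rgb_to_c(255, 0, 0)     # (255/3, 255, 0)
ground = rgb_to_c(195, 95, 0)       # ground surface values of Fig. 10
```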

As can be understood from the above equation and
Fig. 17, the color attributes of a pixel having the
maximum r value (i.e., 255) and zero g and b values, in
the RGB color space, are expressed as a position within
the color space of Fig. 17 which has the c1, c2, c3
coordinates (255/3, 255, 0). This is the point
designated as "red" in Fig. 17. Similarly, points
which correspond to the "maximum blue component, zero
red and green components" and "maximum green component,
zero red and blue components" conditions within the
RGB color space are respectively indicated as the
"blue" and "green" points in Fig. 17.

Step 10: the pixel vector data PV are generated
from the pixel values. Pixel vector data are generated
for each pixel based on a combination of the attribute
values of the pixel. A vector data set PV(x, y) is
generated for each of the pixels, by applying equation
(8) below to the pixel values c1(x, y), c2(x, y),
c3(x, y). By adjusting the parameters a1, a2 and a3 of
equation (8), through input of control parameter
adjustment data to the color vector data generating
section 121, it is possible to determine whether the
edge detection will be based mainly on the c1 values,
the c2 values, or on the c3 values, i.e., the relative
contributions made by the c1, c2 and c3 coordinates of
a color vector to the magnitude of the modulus of the
color vector can be adjusted by altering the values of
the control parameters a1, a2 and a3. The resultant
color vector is expressed as follows.



\[
PV(x, y) =
\begin{pmatrix}
a_1 \cdot c_1(x, y) \\
a_2 \cdot c_2(x, y) \\
a_3 \cdot c_3(x, y)
\end{pmatrix}
\tag{8}
\]
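
The effect of the control parameters of equation (8)
can be sketched in Python as follows. This is an
illustrative sketch only: the function names
color_vector and modulus and the example parameter
values are hypothetical and are not taken from the
embodiment.

```python
import math


def color_vector(c, a=(1.0, 1.0, 1.0)):
    """Weight the coordinates (c1, c2, c3) by (a1, a2, a3), per equation (8)."""
    return tuple(ai * ci for ai, ci in zip(a, c))


def modulus(v):
    """Magnitude (modulus) of a color vector."""
    return math.sqrt(sum(x * x for x in v))


# Hypothetical setting that emphasizes the intensity coordinate c1
# relative to the two chrominance coordinates c2 and c3.
pv = color_vector((85.0, 255.0, 0.0), a=(2.0, 0.5, 0.5))
```

Raising a1 relative to a2 and a3 increases the
contribution of c1 to the modulus of the color vector,
which corresponds to basing the edge detection mainly
on intensity.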



Fig. 19 is a flow diagram showing the processing 
executed with this embodiment to derive the candidate 
edge strength values (MOD) and edge directions (DIR) 
for the pixels of the color image. As shown, this 
differs from the corresponding diagram of Fig. 11 of 
the first embodiment only with respect to the steps 
1002a, 1002b which replace step 1002 of Fig. 11, for 
deriving the color vectors as sets of coordinates 
expressing respective positions within the color space 
of Fig. 17. 

A specific example will be described in the
following. The upper part of Fig. 18 shows data of a
color image representing a simplified aerial photograph
which is to be subjected to image recognition.
Examples of the r, g and b values for various regions
of the color image, and the corresponding sets of c1,
c2, c3 values which express the color attributes of
these regions as positions in the color space of Fig.
17 are also indicated in the drawing. As described
above, the respective sets of r, g and b values of the
pixels, for the RGB color space, are converted to
corresponding sets of c1, c2, c3 coordinates, the
values of the control parameters a1, a2, a3 are set in
accordance with the characteristics of the color image
(for example, if required, such that differences in
respective intensity values between adjacent regions
will have a relatively large effect upon the
differences between magnitudes of corresponding color
vectors, as described hereinabove), and respective color
vectors for the pixels of the color image, expressed in
the color space of Fig. 17, are thereby obtained. Edge
detection is then performed, to obtain the shape of the
street and the building as designated by numerals 50
and 51 respectively in the lower part of Fig. 18.

As described above, with this embodiment,
respective color vectors for the pixels of the color
image are derived by transform processing of the stored
image data into coordinates of a color space which is
more appropriate for edge detection processing than the
original RGB color space. That is to say, the image
data are subject to conversion to color space
coordinates whereby the edge detection processing can
be adjusted (i.e., by altering the relative values of
the control parameters) such as to match the edge
detection processing to the particular characteristics
of the image that is to be subjected to image
recognition processing. For example, if differences
between various regions of the image are primarily
gray-scale variations, i.e., variations in intensity
rather than in chrominance, then this fact can readily
be judged beforehand by a human operator, and the
control parameter values adjusted such as to emphasize
the effects of variations in intensity values upon the
edge detection process.

A third embodiment of an image recognition
apparatus according to the present invention will be
described. The apparatus configuration is identical to
that of the second embodiment (shown in Fig. 15).

The basic operation sequence of this embodiment is
similar to that of the second embodiment, shown in Fig.
16. However with the third embodiment, the transform is
performed from an RGB color space to an HSI color
space, instead of the color space of Fig. 17. That is
to say, steps 11, 12 and 13 are identical to those of
the first embodiment, however step 20 is performed as
follows.

Step 20: each pixel value is transformed from
the RGB color space to the coordinates of the
cylindrical color space shown in Fig. 20. Each set of
pixel values r(x, y), g(x, y), b(x, y) is operated on,
using equation (9), to obtain a corresponding set of
hue, saturation and intensity values as h(x, y),
s(x, y) and i(x, y) respectively of the HSI color space
of Fig. 20. In this case, the gray-scale values, i.e.
values of intensity extending from black (as value 0)
to white (as maximum value), are plotted along the
vertical axis of the cylindrical coordinate system
shown in the left side of Fig. 20.



\[
\begin{aligned}
i_{max} &= \max(r, g, b), \qquad i_{min} = \min(r, g, b) \\
i &= \frac{i_{max} + i_{min}}{2} \\
s &= \begin{cases}
0 & \text{if } i_{max} = i_{min} \\
\dfrac{i_{max} - i_{min}}{i_{max} + i_{min}} \cdot max\_value
  & \text{if } i \le max\_value / 2 \\
\dfrac{i_{max} - i_{min}}{2 \cdot max\_value - i_{max} - i_{min}} \cdot max\_value
  & \text{if } i > max\_value / 2
\end{cases} \\
r_1 &= \frac{i_{max} - r}{i_{max} - i_{min}}, \qquad
g_1 = \frac{i_{max} - g}{i_{max} - i_{min}}, \qquad
b_1 = \frac{i_{max} - b}{i_{max} - i_{min}} \\
h &= \begin{cases}
\text{undefined} & \text{if } i_{max} = i_{min} \\
(b_1 - g_1) \, \pi / 3 & \text{if } r = i_{max} \\
(2 + r_1 - b_1) \, \pi / 3 & \text{if } g = i_{max} \\
(4 + g_1 - r_1) \, \pi / 3 & \text{if } b = i_{max}
\end{cases}
\end{aligned}
\tag{9}
\]

The saturation value expresses the depth of a
color, and corresponds to a distance extending radially
from the center of the coordinate system shown in the
right side of Fig. 20. The hue value corresponds to an
angle in the coordinate system shown on the right side
of Fig. 20. For example when this angle is zero
degrees, this corresponds to the color red, while an
angle of 2π/3 radians corresponds to blue.

It should be noted that there are various models
for performing the transform from an RGB to an HSI
color space, and that the present invention is not
limited to use of equation (9) for that purpose. With
equation (9) the range of values of each of r, g, b, i,
and s is from 0 to the maximum value (i.e., 255 in the
case of 8-bit data values), designated as "max_value".
The range of values of h is from 0 to 2π radians. For
simplicity, the image position coordinates (x, y) have
been omitted from the equation.
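
One possible reading of the transform of equation (9)
can be sketched in Python as follows. This is a sketch
under stated assumptions rather than the implementation
of the embodiment: the function name rgb_to_hsi is
hypothetical, an undefined hue is represented by None,
the factor π/3 and the term 2·max_value follow the
common form of this type of transform, and a negative
hue is wrapped by adding 2π so that h lies in the
stated range of 0 to 2π.

```python
import math


def rgb_to_hsi(r, g, b, max_value=255):
    """Return (h, s, i) for one pixel; h is None when hue is undefined."""
    imax, imin = max(r, g, b), min(r, g, b)
    i = (imax + imin) / 2
    if imax == imin:
        return (None, 0.0, i)          # gray pixel: hue undefined, s = 0
    span = imax - imin
    if i <= max_value / 2:
        s = span / (imax + imin) * max_value
    else:
        s = span / (2 * max_value - imax - imin) * max_value
    r1 = (imax - r) / span
    g1 = (imax - g) / span
    b1 = (imax - b) / span
    if r == imax:
        h = (b1 - g1) * math.pi / 3
    elif g == imax:
        h = (2 + r1 - b1) * math.pi / 3
    else:
        h = (4 + g1 - r1) * math.pi / 3
    if h < 0:
        h += 2 * math.pi               # assumption: wrap into 0 to 2*pi
    return (h, s, i)
```

With these assumptions, the pixel (255, 0, 0) is mapped
to a hue of zero, full saturation, and an intensity of
one half of max_value.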

With this embodiment, step 10 of the flow diagram
of Fig. 16 is executed as follows. Using equation (10)
below, color vectors PV(x, y) are generated for each of
the pixels, from the hue, saturation and intensity
values h(x, y), s(x, y), i(x, y) of each pixel.

\[
PV(x, y) =
\begin{pmatrix}
a \cdot s(x, y) \cdot \cos(h(x, y)) \\
a \cdot s(x, y) \cdot \sin(h(x, y)) \\
i(x, y)
\end{pmatrix}
\tag{10}
\]

Here each color vector PV is generated by
converting the portions h(x, y), s(x, y) that are
expressed in polar coordinates to a linear coordinate
system. By adjusting the value of the control
parameter "a", it becomes possible for example to place
emphasis on the intensity values, in the edge detection
processing. For example if the value of the parameter
"a" is made equal to 1, then edge detection processing
will be performed placing equal emphasis on all of the
values in the HSI space, while if the value of the
parameter "a" is made less than 1, then edge detection
processing will be performed placing greater emphasis
on intensity values.

That is to say, the relative contribution of the
intensity component of the color attributes of a pixel
to the magnitude of the modulus of the color vector of
that pixel will increase in accordance with decreases
in the value of the control parameter "a".
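
The conversion of equation (10) can be sketched in
Python as follows. This is a minimal sketch: the
function name hsi_to_vector is hypothetical, and the
handling of an undefined hue (represented here by None,
for which the saturation is zero) is an assumption.

```python
import math


def hsi_to_vector(h, s, i, a=1.0):
    """Convert polar (h, s) plus intensity i to a linear color vector."""
    if h is None:
        # Assumption: hue is undefined only when s is 0, so the first
        # two coordinates vanish whatever angle would be used.
        return (0.0, 0.0, float(i))
    return (a * s * math.cos(h), a * s * math.sin(h), float(i))


# With a < 1 the chrominance coordinates shrink while the intensity
# coordinate is unchanged, so intensity differences dominate.
pv = hsi_to_vector(0.0, 255.0, 127.5, a=0.5)
```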

The operation of this embodiment for generating
respective color vectors corresponding to the pixels of
the color image is shown in more detail in the flow
diagram of Fig. 23. This differs from the corresponding
flow diagram of Fig. 11 for the first embodiment in
that the step 1002 of the first embodiment, for
deriving the array of color vectors PV which are to be
operated on using the edge templates in equation (2) as
described above to obtain the edge vectors EV1(x,y) to
EV2(x,y), is replaced by a series of three steps, 1002a,
1002c and 1002d.

In the first of these, step 1002a, the respective
sets of r, g, b values for the object pixel and its
eight adjacent surrounding pixels are obtained from the
image data storage section 1, and in step 1002c each of
these sets of r, g, b values of the RGB color space is
converted to a corresponding set of h, s, i values of
the cylindrical HSI color space shown in Fig. 20. In
step 1002d, each of these sets is converted to a
corresponding set of three linear coordinates, i.e., of
an orthogonal color space, using the trigonometric
operation described above, to thereby express the hue
and saturation information of each pixel in terms of
linear coordinates instead of polar coordinates, while
each of the resultant s.cos h and s.sin h values is
multiplied by the control parameter "a", as indicated
by equation (10).

A specific example will be described in the
following. Fig. 21 shows data of a color image
representing a simplified aerial photograph which is to
be subjected to image recognition. As opposed to the
image of the upper part of Fig. 10, it is assumed with
the image of Fig. 21 that there are ranges of variation
of pixel values, as would occur in the case of an
actual aerial photograph. Thus in each of the regions
of the color image, rather than all of the RGB values
of that region being identical, there is a certain
degree of scattering of these pixel values.

As described above, the color attributes of the
pixels of the color image are converted from RGB to HSI
color space coordinates, which are then converted to
respective coordinates of an orthogonal system by
applying equation (10) above, to thereby obtain
respective color vectors corresponding to the pixels,
and edge detection processing is then applied to the
color vectors in the same manner as described for the
first embodiment. The result of applying this
processing to the image shown in Fig. 21 is illustrated
in Fig. 22. As shown, the shapes of the street and the
building have been extracted from the original image,
as indicated by numerals 52 and 53 respectively. Due to
the scattering of pixel values in the original color
image, some level of noise will arise in the edge
detection process, so that as shown in Fig. 22, some
discontinuities occur in the outlines of the street and
the building.

Thus with this embodiment of the present
invention, pixel vector data are generated after having
converted pixel values which have been stored as
coordinates of a certain color space into the
coordinates of an HSI color space, which are then
converted to linear coordinates of a color space in
which the luminance and chrominance information
correspond to respectively different coordinates. This
simplifies edge detection, since the overall hue,
saturation and intensity characteristics of a color
image can be readily judged by a human operator, and
the value of the control parameter "a" can thereby be
set appropriately by the operator, to enable effective
edge detection to be achieved.

A fourth embodiment of an image recognition
apparatus will be described. The configuration is
basically similar to that of the second embodiment
(shown in Fig. 15).

The operation sequence of this embodiment is
similar to that of the second embodiment, shown in the
flow diagram of Fig. 16, with steps 11, 12 and 13 being
identical to those of the first embodiment. The
contents of step 20 of Fig. 16, with the fourth
embodiment, differ from those of the second embodiment
and are as follows.

Step 20: the pixel values are transformed from the
RGB color space to the coordinates of the cylindrical
HSI color space shown in Fig. 20, using equation (9) as
described hereinabove for the third embodiment.
Equation (11) is then applied to transform the
respective sets of h, s, i values obtained for each of
the pixels of the color image to the coordinates
of a color space of the inverted conical form shown in
Fig. 24, i.e., to coordinates h', s', i' of a modified
form of HSI color space.

\[
\begin{aligned}
h'(x, y) &= h(x, y) \\
s'(x, y) &= \frac{i(x, y)}{max\_value} \cdot s(x, y) \\
i'(x, y) &= i(x, y)
\end{aligned}
\tag{11}
\]

Thus, the color space transform operation is
performed by applying equation (11) above to convert
each h(x, y), s(x, y), i(x, y) set of values, for the
pixel located at position (x, y) of the color image, to
a set of h'(x, y), s'(x, y), i'(x, y) values
respectively. This transform does not produce any
change between h(x, y) and h'(x, y), or between i(x, y)
and i'(x, y), however as the value of i(x, y) becomes
smaller, the value of s'(x, y) is accordingly reduced.
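
The transform of equation (11) can be sketched in
Python as follows. This is a minimal sketch, in which
the function name to_inverted_cone is hypothetical and
max_value is assumed to be 255.

```python
def to_inverted_cone(h, s, i, max_value=255):
    """Map cylindrical (h, s, i) to inverted-conical (h', s', i')."""
    # Hue and intensity are unchanged; saturation is scaled in
    # proportion to intensity, so it shrinks toward zero near black.
    return (h, i / max_value * s, i)


bright = to_inverted_cone(1.0, 200.0, 255)   # saturation unchanged
dark = to_inverted_cone(1.0, 200.0, 51)      # saturation reduced to 1/5
```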

With this embodiment, the contents of step 10 of
the flow diagram of Fig. 16 are as follows. Respective
color vectors are generated for each of the pixels,
with the vectors expressed as respective sets of linear
coordinates of an orthogonal color space, by applying
equation (12) below to the set of polar coordinates
h'(x, y), s'(x, y), i'(x, y) that have been derived for
the pixel by applying equation (11).

\[
PV(x, y) =
\begin{pmatrix}
a \cdot s'(x, y) \cdot \cos(h'(x, y)) \\
a \cdot s'(x, y) \cdot \sin(h'(x, y)) \\
i'(x, y)
\end{pmatrix}
\tag{12}
\]



Thus, each color vector is generated by converting
the portions h'(x, y), s'(x, y) of the h', s', i'
information for each pixel, i.e., the values that are
expressed in polar coordinates, to a linear coordinate
system. By adjusting the value of the control parameter
"a", the form of emphasis of the edge detection
processing can be altered, i.e., the relative
contribution of the intensity component of the color
attributes of each pixel to the magnitude of the
modulus of the color vector that is derived for the
pixel can be modified, so that it becomes possible to
place emphasis on variations in intensity between
adjacent regions, in the edge detection processing.
For example if the value of the parameter "a" is made
equal to 1, then edge detection processing will be
performed placing equal emphasis on all of the hue,
saturation and intensity values, while if the value of
the parameter "a" is made less than 1, then edge
detection processing will be performed placing greater
emphasis on intensity values.

The operation of this embodiment for generating
respective color vectors corresponding to the pixels of
the color image is shown in the partial flow diagram of
Fig. 27. This differs from the corresponding flow
diagram of Fig. 11 for the first embodiment in that the
step 1002 of the first embodiment, for deriving the
array of color vectors PV which are to be operated on
by applying the edge templates in equation (2) as
described above to obtain the edge vectors EV1(x,y) to
EV2(x,y), is replaced by a series of four steps, 1002a,
1002c, 1002e and 1002f. In step 1002a, the respective
sets of r, g, b values for the object pixel and its
eight adjacent surrounding pixels are obtained from the
color image data storage section 1, and in step 1002c
each of these sets of r, g, b values of the RGB color
space is converted to a corresponding set of h, s, i
values of the cylindrical-shape HSI color space shown
in Fig. 20. In step 1002e, each of these sets of h, s,
i values is converted to a corresponding set of h', s',
i' values of the inverted-conical H'S'I' color space.
In step 1002f, each of these sets is converted to a
corresponding set of three linear coordinates, i.e., of
an orthogonal color space, while each of the resultant
s'.cos h' and s'.sin h' values is multiplied by the
control parameter "a", as indicated by equation (12).

The remaining steps of this flow diagram, which
are omitted from Fig. 27, are identical to steps 1003
to 1010 of Fig. 11.

A specific example will be described in the
following. In the same way as for the third
embodiment, it will be assumed that the simplified
aerial photograph of Fig. 21 is the image that is to be
subjected to recognition processing.

As described above, the RGB values of the pixels
are first converted to HSI values of the cylindrical
color space of Fig. 20, and these are then transformed
to H'S'I' form, as coordinates of the inverted-conical
color space shown in Fig. 24. The first and second
columns of values in the table of Fig. 26 show the
relationship between respective HSI values for each of
the regions, and the corresponding H'S'I' values
resulting from the transform. In the case of the
transform into the HSI space, the lower the values of
intensity become, the greater will become the degree of
scattering of the values of saturation. This is a
characteristic feature of the transform from RGB to the
HSI space. For example, if all of the RGB values of a
pixel are small, signifying that the intensity is low,
then a change of 1 in any of the RGB values will result
in an abrupt change in the corresponding saturation
value. Thus, since sudden changes in color will occur
at positions where such abrupt variations in the
saturation values occur, edges may be erroneously
detected even at positions where there is no actual
border of any of the objects which are to be
recognized. However in the case of a transform into
H'S'I' values of the inverted-conical HSI space, the
lower the value of intensity of the pixels, the smaller
will become the value of s', so that the scattering of
the values of s' is suppressed. As a result, random
abrupt changes in the magnitudes of the moduli of the
color vectors which are derived by applying equation
(12) can be eliminated, enabling greater accuracy of
edge detection.
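
The instability of saturation at low intensity, and its
suppression by the inverted-conical transform, can be
illustrated numerically with the following Python
sketch. It is illustrative only: the function names and
the example pixel values are hypothetical, max_value is
assumed to be 255, and the saturation formula follows
the form of equation (9) with the term 2·max_value in
the denominator of the high-intensity case.

```python
def saturation(r, g, b, max_value=255):
    """Saturation s in the form of equation (9)."""
    imax, imin = max(r, g, b), min(r, g, b)
    if imax == imin:
        return 0.0
    i = (imax + imin) / 2
    if i <= max_value / 2:
        return (imax - imin) / (imax + imin) * max_value
    return (imax - imin) / (2 * max_value - imax - imin) * max_value


def scaled_saturation(r, g, b, max_value=255):
    """Saturation s' of equation (11): s scaled by i / max_value."""
    i = (max(r, g, b) + min(r, g, b)) / 2
    return i / max_value * saturation(r, g, b, max_value)


# Effect of a change of 1 in the r value of a dark pixel and of a
# mid-intensity pixel (hypothetical example values).
dark_jump = abs(saturation(3, 1, 1) - saturation(2, 1, 1))
mid_jump = abs(saturation(121, 100, 100) - saturation(120, 100, 100))
dark_jump_scaled = abs(scaled_saturation(3, 1, 1)
                       - scaled_saturation(2, 1, 1))
```

With these inputs, the change of 1 alters s of the dark
pixel by 42.5 but alters s of the mid-intensity pixel by
only about 1, while s' of the dark pixel changes by only
0.5.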

Fig. 25 shows the image recognition processing
results which are obtained when this embodiment is
applied to edge detection of the color image
represented in Fig. 21. The building face 1 and
building face 2 in the image of Fig. 21 are each
regions of low values of intensity, so that the noise
level for these regions, due to erroneous detection of
spurious edges, could be expected to be high. However
as shown in Fig. 25, such noise is substantially
suppressed, with the shapes of the street and building
of the image of Fig. 21 being extracted as indicated by
numerals 54, 55 respectively.

Thus as described above, with this embodiment,
when color values are transformed into the HSI space,
the saturation values are varied in accordance with the
intensity values by converting the h, s and i values
for each pixel to a corresponding set of values that
are coordinates of an inverted-conical shape of color
space, so that the instability of values of saturation
that is a characteristic feature of the transform from
RGB to HSI values can be reduced, whereby the
occurrence of noise in the obtained results can be
substantially suppressed, and reliable edge detection
can be achieved.

A fifth embodiment of an image recognition 
apparatus will be described. The configuration is 
identical to that of the second embodiment (shown in 
Fig. 15). 

The basic operation sequence of this embodiment is
identical to that of the second embodiment, shown in
Fig. 16. Steps 11, 12 and 13 are identical to those of
the first embodiment. With this embodiment, the
operation of step 20 of the flow diagram of Fig. 16
differs from that of the second embodiment, as follows.
In step 20, the pixel values are transformed from the
RGB color space to coordinates of the cylindrical HSI
color space shown in Fig. 20, using equation (9).
Equation (13) below is then applied to transform the
pixel values to the coordinates of a color space of the
double-conical form shown in Fig. 28.



\[
\begin{aligned}
h'(x, y) &= h(x, y) \\
s'(x, y) &= \left[ 1 - \frac{\left| i(x, y) - max\_value / 2 \right|}{max\_value / 2} \right] s(x, y) \\
i'(x, y) &= i(x, y)
\end{aligned}
\tag{13}
\]



The equation (13) effects a transform of each set
of coordinates of a pixel with respect to the
cylindrical HSI space, i.e., h(x, y), s(x, y), i(x, y),
to a corresponding set of hue, saturation and intensity
coordinates of the double-conical color space of Fig.
28, which will be designated as h'(x, y), s'(x, y),
i'(x, y) respectively. This transform does not produce
any change between h(x, y) and h'(x, y), or between
i(x, y) and i'(x, y). Furthermore, if the value of
i(x, y) is near the intensity value which is located
midway between the maximum and minimum values of
intensity (i.e., 1/2 of the white level value) there is
substantially no difference between each value of
s'(x, y) and s(x, y). However as the value of i(x, y)
becomes greater or smaller than the intermediate value,
the value of s'(x, y) is accordingly reduced in
relation to s(x, y).
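
The transform of equation (13) can be sketched in
Python as follows. This is a minimal sketch, in which
the function name to_double_cone is hypothetical and
max_value is assumed to be 255.

```python
def to_double_cone(h, s, i, max_value=255):
    """Map cylindrical (h, s, i) to double-conical (h', s', i')."""
    half = max_value / 2
    # The scale factor is 1 at mid intensity and falls linearly to 0
    # at both black (i = 0) and white (i = max_value).
    scale = 1 - abs(i - half) / half
    return (h, scale * s, i)


mid = to_double_cone(0.0, 200.0, 127.5)     # saturation unchanged
dark = to_double_cone(0.0, 200.0, 63.75)    # saturation halved
white = to_double_cone(0.0, 200.0, 255)     # saturation reduced to zero
```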

The operation of this embodiment for generating
respective color vectors corresponding to the pixels of
the color image is shown in more detail in the flow
diagram of Fig. 30. This differs from the corresponding
flow diagram of Fig. 11 for the first embodiment in
that the step 1002 of the first embodiment, for
deriving the array of color vectors PV which are to be
operated on by applying the edge templates in equation
(2) as described above to obtain the edge vectors
EV1(x,y) to EV2(x,y), is divided into four steps, 1002a,
1002c, 1002g and 1002h. In step 1002a, the respective
sets of r, g, b values for the object pixel and its
eight adjacent surrounding pixels are obtained from the
color image data storage section 1, and in step 1002c
each of these sets of r, g, b values of the RGB color
space is converted to a corresponding set of h, s, i
values of the cylindrical-shape HSI color space shown
in Fig. 20. In step 1002g, each of these sets of h, s,
i values is converted to a corresponding set of h', s',
i' values of the double-conical H'S'I' color space
shown in Fig. 28. In step 1002h, each of these sets
is converted to a corresponding set of three linear
coordinates, i.e., of an orthogonal color space, by
applying the processing of equation (13).

The remaining steps of this flow diagram,
which are omitted from Fig. 30, are identical to steps
1003 to 1010 of Fig. 11.

A specific example will be described in the
following. In the same way as for the third
embodiment, it will be assumed that the simplified
aerial photograph of Fig. 21 is the color image data
that are to be subjected to recognition processing.
Firstly, the RGB values of the pixels are converted to
HSI values of the cylindrical HSI color space, and
these are then transformed to H'S'I' values of the
double-conical color space. The first and third
columns of values in Fig. 26 show the relationship
between respective HSI values for each of the regions,
and the corresponding H'S'I' values resulting from a
transform into the coordinates of the double-conical
form of H'S'I' color space.

The image recognition processing results obtained
when this embodiment is applied to edge detection of
the color image represented in Fig. 21 are as shown in
Fig. 29. As can be seen, not only is the noise in the
low-intensity regions such as the building face 1 and
building face 2 of the image of Fig. 21 reduced, but
noise is also greatly reduced in high-intensity regions
such as the building roof and the street, with the
shapes of the street and building being extracted as
indicated by numerals 56, 57 respectively.

Thus with this embodiment, saturation values are
reduced in regions of high or low intensity values,
i.e., regions in which instability of saturation values
can be expected to occur as a result of the transform
from the RGB to the HSI color space. Hence, the
instability of saturation values can be substantially
reduced, so that noise caused by these saturation
values can be suppressed, and accurate edge detection
can be achieved.

A sixth embodiment of an image recognition
apparatus will be described. The configuration is
identical to that of the second embodiment shown in
Fig. 15, while the basic operation sequence is similar
to that of the second embodiment, shown in the flow
diagram of Fig. 16. Steps 11, 12 and 13 are identical
to those of the first embodiment, shown in the flow
diagram of Fig. 5. Step 10 is basically similar to that
of the fourth embodiment. The step of performing
the transform from the RGB color space to a different
color space (step 20 of Fig. 16) is executed as follows
with this embodiment. Firstly, the transform of the
pixel values from sets of r, g, b values of the RGB
color space to h, s, i values of the cylindrical HSI
color space of Fig. 20 is performed, using equation (9)
as described hereinabove for the preceding embodiment.
With the sixth embodiment of the invention, the
respective sets of h, s, i values derived for the
pixels of the color image are then converted to
coordinates of a modified H'S'I' color space by
applying a saturation value modification function,
which varies in accordance with the actual changes in
the degree of sensitivity of the saturation values to
small changes in intensity values. This function is
generated and utilized as follows:

(1) The first step is to derive, for each of the
possible values of intensity i, all of the sets of (r,
g, b) values which will generate that value of i when
the transform from the RGB to HSI color space is
performed. That is, for each intensity value i(n),
where n is in the range from the minimum to maximum
(e.g., 255) values, a corresponding group of sets of
(r, g, b) values is derived.

(2) For each intensity value, a corresponding set
of values of a function which will be designated as
f1(r,g,b) is derived. These express, for each of the
sets of (r, g, b) values, the amount of change which
would occur in the corresponding value of saturation s,
if the value of the red component r were to be altered
in the range ±1. Each value of f1(r,g,b) is calculated
as follows:

\[
f_1(r,g,b) =
\begin{cases}
|s(r+1,g,b) - s(r,g,b)| & \text{if } r = 0 \\
|s(r+1,g,b) - s(r,g,b)| + |s(r,g,b) - s(r-1,g,b)| & \text{if } 0 < r < \mathrm{max\_value} \\
|s(r,g,b) - s(r-1,g,b)| & \text{if } r = \mathrm{max\_value}
\end{cases}
\tag{14a}
\]

(3) Next, for each of the possible values of intensity
i, the average of the corresponding set of values of
f1(r,g,b) is obtained, i.e., a function of i is
obtained which will be designated as f2(i).
Designating the total number of sets of (r,g,b) values
corresponding to a value of intensity i as k(i), this
can be expressed as:

\[
f_2(i) = \frac{\sum f_1(r,g,b)}{k(i)}
\tag{14b}
\]

where the sum is taken over all combinations of r, g, b
values which result in the intensity value i. That is,
Σf1(r,g,b) signifies, for each value of i, the sum of
all of the values obtained as f1(r,g,b) for that value
of i, i.e., derived from all of the k(i) sets of
(r,g,b) value combinations which will result in that
value of i when a transform from RGB to HSI coordinates
is performed.

(4) The required saturation value modification
function f(i) is then obtained as follows, designating
the minimum value obtained for f2(i) as min f2(i), and
the maximum possible value of i as max_value:

\[
f(i) = \frac{\min f_2(i)}{f_2(i)} \cdot \mathrm{max\_value}
\tag{14c}
\]
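
Steps (1) to (4) above can be sketched in Python as follows. Equation (9) is not reproduced in this passage, so the `saturation` and `intensity` functions below are hypothetical stand-ins (a common HSI-style saturation and the integer mean of r, g, b), and all function names are assumptions made for illustration only.

```python
from collections import defaultdict


def saturation(r, g, b, max_value):
    """Hypothetical saturation, standing in for equation (9)."""
    total = r + g + b
    if total == 0:
        return 0.0
    return max_value * (1.0 - 3.0 * min(r, g, b) / total)


def intensity(r, g, b):
    """Hypothetical integer intensity: floor of the mean of r, g, b."""
    return (r + g + b) // 3


def f1(r, g, b, max_value):
    """Equation (14a): change in s when r is altered by +1 and/or -1."""
    s0 = saturation(r, g, b, max_value)
    change = 0.0
    if r < max_value:  # forward difference exists
        change += abs(saturation(r + 1, g, b, max_value) - s0)
    if r > 0:          # backward difference exists
        change += abs(s0 - saturation(r - 1, g, b, max_value))
    return change


def modification_function(max_value):
    """Equations (14b) and (14c): return the list f[0..max_value]."""
    sums = defaultdict(float)   # sum of f1 for each intensity i
    counts = defaultdict(int)   # k(i): number of (r, g, b) sets per i
    levels = range(max_value + 1)
    for r in levels:
        for g in levels:
            for b in levels:
                i = intensity(r, g, b)
                sums[i] += f1(r, g, b, max_value)
                counts[i] += 1
    f2 = [sums[i] / counts[i] for i in levels]
    f2_min = min(f2)
    return [f2_min / f2[i] * max_value for i in levels]
```

The exhaustive loop visits every (r, g, b) combination, so a small `max_value` such as 7 is convenient for experimentation; the resulting table would normally be computed once and then reused for every pixel.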

The function f(i) is shown in Fig. 31. The higher
the value of f(i) obtained from equation (14c) above,
the greater will be the stability of the s values with
respect to changes in the value of the red component r,
and the function is derived on the assumption that
such stability also corresponds to stability with
respect to changes in the intensity component i.
Conversely, the lower the value of f(i), the greater
will be the degree of instability of s with respect
to changes in the value of r, and hence with respect to
changes in the value of i.

That is to say, it is assumed that the values of
saturation s will tend to be unstable in regions of the
color image where the values of the red component r are
high, and also in regions where the values of r are
low. Next, using equation (15) below, the respective
sets of h, s, i values of the HSI cylindrical color
space derived for the pixels of the color image are
transformed into corresponding sets of coordinates
h', s', i' of the modified cylindrical type of color
space shown in Fig. 32, by applying the function f(i)
derived above. It can be understood that the shape of
this modified cylindrical color space is formed by
rotating the graph of the function f(i) shown in Fig.
31 about its i-axis.

\[
\begin{aligned}
h'(x, y) &= h(x, y) \\
s'(x, y) &= s(x, y) \cdot \frac{f(i(x, y))}{\mathrm{max\_value}} \\
i'(x, y) &= i(x, y)
\end{aligned}
\tag{15}
\]
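
A minimal Python sketch of equation (15) is given below. It assumes that the function f(i) has already been tabulated as a sequence `f` indexed by integer intensity; the name `apply_modification` and the table layout are assumptions for illustration.

```python
def apply_modification(h, s, i, f, max_value=255):
    """Equation (15): scale s by f(i) / max_value.

    f is a sequence indexed by the integer intensity i.
    Hue and intensity are returned unchanged.
    """
    return h, s * f[i] / max_value, i
```

For example, with a toy table `f = [0, 5, 10]` and `max_value=10`, a pixel of saturation 8 at intensity 1 is mapped to saturation 4.0, while at intensity 2 its saturation is left at 8.0.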

The operation of this embodiment for generating
respective color vectors corresponding to the pixels of
the color image is shown in the partial flow diagram of
Fig. 34. This differs from the corresponding flow
diagram of Fig. 11 for the first embodiment in that the
step 1002 of the first embodiment, for deriving the
array of color vectors PV, is replaced by a series of
four steps, 1002a, 1002c, 1002i and 1002j. In step
1002a, the respective sets of r, g, b values for the
object pixel and its eight adjacent surrounding pixels
are obtained from the color image data storage section
1. In step 1002c, each of these sets of r, g, b
values of the RGB color space is converted to a
corresponding set of h, s, i values of the
cylindrical-shape HSI color space shown in Fig. 20. In
step 1002i, each of these sets of h, s, i values is
converted to a corresponding set of h', s', i' values
of the modified cylindrical H'S'I' color space shown in
Fig. 32, by applying equation (15). In step 1002j, each
of these sets is converted to a corresponding set of
three linear coordinates, i.e., of an orthogonal color
space, while each of the resultant s'·cos h' and
s'·sin h' values is multiplied by the control parameter
"a", as indicated by equation (12).

The remaining steps of this flow diagram, which 
are omitted from Fig. 34, are identical to steps 1003 
to 1010 of Fig. 11. 

A specific example will be described in the
following. In the same way as for the third embodiment,
it will be assumed that the simplified aerial
photograph of Fig. 21 constitutes the color image data
that are to be subjected to recognition processing.
With this embodiment, step 20 of Fig. 16, for
conversion to a different color space, is executed as
follows. The RGB values of the pixels are converted to
respective sets of h, s, i values of the cylindrical
HSI color space of Fig. 20, and these are then
transformed to h', s', i' coordinates of the modified
cylindrical color space shown in Fig. 32, by applying
the aforementioned function f(i). The contents of the
first and fourth columns of values in the table of Fig.
26 show the relationship between respective HSI values
for each of the regions of the color image of Fig. 21,
and the corresponding H'S'I' values resulting from a
transform into the coordinates of the modified
cylindrical color space.

Fig. 33 shows the results of image recognition
processing obtained when this embodiment is applied to
the color image represented in Fig. 21. As shown, in
addition to reducing noise in regions of low intensity,
such as the building face 1 and the building face 2,
noise is greatly reduced in regions of high intensity
such as the building roof and the road. In addition,
the shapes of the road and building are very accurately
obtained, as indicated by numerals 58 and 59
respectively, without any interruptions in the
continuity of the edges.

It can thus be understood that with this
embodiment, when the color values of the image are
transformed from the RGB color space to respective sets
of h, s, i values that are coordinates of an HSI color
space, these coordinates are then modified by applying
a predetermined function such that the saturation
values are appropriately reduced in those regions of
the image where instability of the saturation values
would otherwise occur. The function which is utilized
for performing this modification of the saturation
values is derived on the basis of calculating actual
amounts of variation in saturation value that will
occur in response to specific small-scale changes in
one of the r, g, or b values, for each point in the RGB
color space.

Hence, compensation of the saturation values is
applied in an optimum manner, i.e., by appropriate
amounts, and only to those regions where instability of
the saturation values would otherwise occur. This
enables the generation of noise to be effectively
suppressed, while at the same time enabling accurate
detection of edges to be achieved, since the stability
of saturation values is achieved while ensuring that
the maximum possible amount of contribution to the
magnitude of each color vector will be made by the
corresponding set of h', s' and i' values. That is to
say, the maximum possible amount of color information
is used in the edge detection processing, consistent
with stability of the saturation values and resultant
elimination of noise from the edge detection results.

A seventh embodiment of an image recognition
apparatus is shown in Fig. 35. The apparatus is made
up of a region data storage section 4 holding shape data
which express only respective regions of an image (i.e.,
formed of labelled outlines of regions appearing in an
image, such as are generated by the preceding
embodiments), with that labelled image being referred to
in the following as a region image, an image
recognition processing section 2 for performing image
recognition of image data, and a combination-processed
shape data storage section 5 for storing modified shape
data which have been formed by the image recognition
processing section 2 through combining of certain ones
of the regions expressed in the shape data held in the
region data storage section 4.

It should be understood that the term "image
recognition" as applied herein to the operation of the
image recognition processing section 2 signifies a form
of processing for recognizing certain regions within an
image which should be combined with other regions of
that image, and executing such processing.

As shown in Fig. 35, the image recognition
processing section 2 is formed of a small region
detection section 26, a combination object region
determining section 27 and a region combination
processing section 28. The small region detection
section 26 performs selection of certain regions of the
image whose shape data are held in the region data
storage section 4, based upon criteria described
hereinafter. The combination object region determining
section 27 determines those of the regions selected by
the small region detection section 26 which are to be
mutually combined, and the region combination
processing section 28 performs the actual combination
of these regions. The combination object region
determining section 27 includes a small region
determining section, which compares the lengths of the
respective common border lines between a selected
region and each of the regions which are immediately
adjacent to that selected region, and determines the
one of these adjacent regions which has the greatest
length of common border line with respect to the
selected region.




Fig. 36 shows an example of a region image whose
data are stored in the region data storage section 4.
Labels such as "1" and "2" are attached to each of the
pixels, as shown in the left side of Fig. 36. All of
the pixels located within a specific region have the
same label, i.e., there is a region containing only
pixels having the label 1, a region containing only
pixels having the label 2, and so on.

Various techniques are known for separating the
contents of an image into various regions. One method
of defining a region is to select a pixel in the image,
determine those immediately adjacent pixels whose color
attributes are sufficiently close to those of the first
pixel, within a predetermined range, and to
successively expand this process outwards, to thereby
determine all of the pixels which constitute one
region. Another method is to apply edge detection
processing to the image, and to thereby define each
region as a set of pixels which are enclosed within a
continuously extending edge.

With this embodiment, there is no particular 
limitation on the process of generating the region 
image that is stored in the region data storage section 
4. 

The fundamental feature of the embodiment is that
selected small regions, which constitute noise in the
image that is stored in the region data storage section
4, are combined with adjacent larger regions, or small
regions are mutually combined, to thereby eliminate the
small regions and so reduce the level of noise in the
region image. Two regions are combined by converting
the pixel labels of one of the regions to become
identical to the labels of the other region. The
resultant region data, which express the shapes of
objects as respectively different regions, are then
stored in the combination-processed shape data storage
section 5.

Fig. 37 is a flow diagram showing the basic
features of the operation of this embodiment. The
contents are as follows.

Step 70: a decision is made as to whether there is a
set of one or more small regions within the image which
each have an area smaller than s pixels, where s is a
predetermined threshold value. If such a region is
found, operation proceeds to step 71. If not, i.e., if
it is judged that all small regions have been
eliminated, operation is ended.

Step 71: a region r is arbitrarily selected, as the
next small region that is to be subjected to region
combination, from among the set of small regions which
each have an area that is less than s pixels.

Step 72: for each of the regions r1, r2, ..., rn that
are immediately adjacent to the region r, the length of
the common boundary between that adjacent region and
the region r is calculated.

Step 73: the region ri that is immediately adjacent to
the region r and has the longest common boundary line
with the region r is selected.

Step 74: the regions r and ri are combined to form a
new region r'.
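
The sequence of steps 70 to 74 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the region image is a list of lists of integer labels, boundary length is counted as the number of 4-connected pixel edges shared by two labels, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def region_areas(labels):
    """Return {label: pixel count} for a 2D list of labels."""
    return Counter(v for row in labels for v in row)


def border_lengths(labels):
    """Return {region: {neighbour: shared pixel-edge count}}."""
    shared = defaultdict(Counter)
    rows, cols = len(labels), len(labels[0])
    for y in range(rows):
        for x in range(cols):
            a = labels[y][x]
            # Look only down and right so each pixel edge is seen once.
            for ny, nx in ((y + 1, x), (y, x + 1)):
                if ny < rows and nx < cols and labels[ny][nx] != a:
                    b = labels[ny][nx]
                    shared[a][b] += 1
                    shared[b][a] += 1
    return shared


def combine_small_regions(labels, s):
    """Repeat steps 70 to 74 until no region is smaller than s pixels."""
    labels = [list(row) for row in labels]  # work on a copy
    while True:
        areas = region_areas(labels)
        small = [r for r, a in areas.items() if a < s]      # step 70
        if not small or len(areas) == 1:
            return labels
        r = small[0]                                        # step 71
        lengths = border_lengths(labels)[r]                 # step 72
        target = max(lengths, key=lengths.get)              # step 73
        labels = [[target if v == r else v for v in row]    # step 74
                  for row in labels]
```

Combining is done here by relabelling the pixels of the small region with the label of the selected neighbour, in line with the description of label conversion given earlier for this embodiment.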

A specific example will be described. It will be
assumed that the region combination processing is to be
applied to the region image that is shown in the upper
part of Fig. 38. The image contains regions R, R1, R2
and R3. A vehicle 102 is represented by region R,
while a street 100 is represented by the region R1.
Since the area of the region R is less than s pixels,
this region is to be deleted.

There are two regions which are respectively
immediately adjacent to the region R, i.e., the regions
R1 and R2. The respective lengths of common boundary
line between these regions R1, R2 and the region R are
obtained, and it is found that the length of common
boundary line with respect to the region R1 is longer
than that with respect to R2. The region R1 is
therefore selected to be combined with the region R. R
and R1 are then combined to form a new region, which is
designated as R1', as shown in the lower part of Fig.
38. In that way, the region representing a vehicle has
been removed from the region image whose data will be
stored in the combination-processed shape data storage
section 5.

It can be understood that if the pixel values (of
the original color image corresponding to the region
image) within the region R were close to those in the
region R2, i.e., if these two regions were closely
similar in color, and the regions R and R2 were to be
combined on the basis of their closeness of color
values, this would result in the street attaining an
unnatural shape.

With the embodiment described above, a color image
that has already been divided into regions is subjected
to processing without consideration of the pixel values
in the original color image, i.e., processing that is
based only upon the shapes of regions in the image,
such as to combine certain regions which have a common
boundary line. As a result, small regions which
constitute noise can be removed, without lowering the
accuracy of extracting shapes of objects which are to
be recognized. In particular, in the case of processing
image data of an aerial photograph of a city, it is
possible to eliminate the shapes of vehicles on
streets, without lowering the accuracy of extracting
the shapes of the streets.

An eighth embodiment of an image recognition 
apparatus will be described. The configuration is 
identical to that of the seventh embodiment (shown in 
Fig. 35) . 

The operation sequence of this eighth embodiment
is shown in Fig. 40. This operation is basically
similar to that of the seventh embodiment, shown in
the flow chart of Fig. 37, with steps 70, 72, 73, 74
being identical to those of the seventh embodiment.
However, the contents of step 71 are replaced by those
of step 171 in Fig. 40. Specifically, in step 171 of
this embodiment, the region r having the smallest area
of all of the regions of the image which have an area
of less than s pixels (as determined in step 70) is
selected, and step 72 is then applied to that region r.
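
The selection rule of step 171 can be illustrated by the short Python sketch below, which again assumes that the region image is a list of lists of integer labels. The function name `select_smallest_region` is hypothetical.

```python
from collections import Counter


def select_smallest_region(labels, s):
    """Step 171: among regions smaller than s pixels, return the
    label of the one with the smallest area, or None if there is none."""
    areas = Counter(v for row in labels for v in row)
    small = {r: a for r, a in areas.items() if a < s}
    if not small:
        return None
    # If several regions share the smallest area, the first one
    # encountered in scan order is returned.
    return min(small, key=small.get)
```

Replacing the arbitrary choice of step 71 by this function makes the order of combination depend only on region areas.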

A specific example will be described in the
following. It will be assumed that the region image
shown in the upper part of Fig. 39, representing a
building 109 surrounded by a ground area, is to be
subjected to combination processing for extracting only
the shape of the building roof. There are four regions
in the image, R1, R2, R3 and R4, with R4 being the
ground, R3 being a part of the roof of the building 109
which is not covered by rooftop structures, and R1, R2
being respective regions corresponding to first and
second rooftop structures 110, 111 which are formed
upon the roof of building 109. The area of each of R1
and R2 is less than s pixels. Since R1 has the
smallest area of all of the regions that are smaller
than s pixels, as shown in the middle portion of Fig.
39, R1 and R3 are combined to obtain the region R3'.
As a result, R2 becomes the region having the smallest
area, of the regions R2, R3' and R4. Hence, R2 and R3'
are combined, to generate a region R3''. Since the size
of each of the remaining regions R3'' and R4 is greater
than s pixels, the combining processing operation is
then halted.

In that way, the rooftop structures on the 
building are eliminated from the image, so that only 
the shape of the building itself will be extracted. 

It should be noted that if this combining of
regions had been executed in the sequence R2, R1, with
R2 being combined with R4 and R1 being combined with
R3, it would be impossible to accurately extract the
shape of the building.

Thus with this embodiment, combining processing is
repetitively applied to each of the regions that are
below a predetermined size, such as to combine the
region having the smallest area with another region.
As a result, small regions which constitute noise can
be removed, without lowering the accuracy of extracting
shapes for the purpose of object recognition. In
particular, in the case of applying such processing to
image data of an aerial photograph of a city (i.e., in
which, as opposed to the usual type of housing, there
will frequently be complex structures formed upon the
roofs of buildings), this embodiment will enable the
shapes of the buildings to be accurately extracted.

A ninth embodiment of an image recognition
apparatus will be described. The configuration is
identical to that of the seventh embodiment (shown in
Fig. 35).

The operation sequence of this ninth embodiment is
shown in the flow diagram of Fig. 42. This is
basically similar to that of the seventh embodiment
shown in the flow chart of Fig. 37, with steps 70, 72,
73, 74 being identical to those of the seventh
embodiment. However with this ninth embodiment, step
71 of Fig. 37 is replaced by two successive steps 271a,
271b, executed as follows.

Step 271a: for each region having an area that is
smaller than s pixels, where s is the aforementioned
threshold value, the total of the areas of all of the
immediately adjacent regions is obtained.

Step 271b: the region r, for which the total of
the areas of the immediately adjacent regions is a
minimum, is selected to be processed in step 72.
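
Steps 271a and 271b can be sketched in Python as shown below. The sketch assumes a region image held as a list of lists of integer labels and treats two regions as immediately adjacent when they share at least one 4-connected pixel edge; the function name `select_by_adjacent_area` is hypothetical.

```python
from collections import Counter, defaultdict


def select_by_adjacent_area(labels, s):
    """Return the label of the small region (area < s pixels) whose
    immediately adjacent regions have the smallest total area,
    or None if there is no small region."""
    areas = Counter(v for row in labels for v in row)
    neighbours = defaultdict(set)
    rows, cols = len(labels), len(labels[0])
    for y in range(rows):
        for x in range(cols):
            a = labels[y][x]
            for ny, nx in ((y + 1, x), (y, x + 1)):
                if ny < rows and nx < cols and labels[ny][nx] != a:
                    b = labels[ny][nx]
                    neighbours[a].add(b)
                    neighbours[b].add(a)
    # Step 271a: total area of the adjacent regions of each small region.
    totals = {r: sum(areas[n] for n in neighbours[r])
              for r, a in areas.items() if a < s}
    if not totals:
        return None
    # Step 271b: choose the small region with the minimum total.
    return min(totals, key=totals.get)
```

A region that lies wholly inside a building outline tends to have a smaller total of adjacent areas than one that touches the surrounding ground, so that interior regions are selected first.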

A specific example will be described in the
following. It will be assumed that the region image
shown in the upper part of Fig. 41 is to be subjected
to combination processing. There are four regions in
the image, R1, R2, R3 and R4, with R4 being the
surrounding ground. R1 and R2 are regions corresponding
to first and second structures 112, 113 formed on the
roof of building 109, and R3 is the region of that roof
which is not covered by these structures. The area of
each of R1 and R2 is less than s pixels. The
aforementioned sums of areas of immediately adjacent
regions are obtained as follows. The sum of the areas
which are immediately adjacent to R1 is the total area
of R2 and R3, while the sum of such adjacent areas, in
the case of R2, is the total area of R1, R3 and R4. Of
these two total areas of adjacent regions, the smaller
value is obtained for the region R1. Thus, as shown in
the middle part of Fig. 41, the regions R3 and R1 are
combined to form the region R3'. In the next
repetition of step 71, it is found that there is only a
single region which is smaller than s pixels, and that
this is immediately adjacent to the regions R3' and R4.
Since R3' is the smaller of these adjacent regions, R2
and R3' are combined to form a region R3''. Since the
size of that region is greater than s pixels, the
combining processing operation is then halted.

In that way, the structures on the building roof
have been eliminated, leaving only the outline of the
building roof itself.

It should be noted that if this combining of
regions had been executed in the sequence R2, R1, with
R2 being combined with R4 and R1 being combined with
R3, it would be impossible to accurately extract the
shape of the building.

Thus with this embodiment, combining processing is
repetitively executed such as to combine the region
which is below the threshold value of size (s pixels)
and for which the total area of the immediately
adjacent regions is the smallest, with another region.
As a result, small regions which constitute noise can
be removed, without lowering the accuracy of extracting
shapes for the purpose of object recognition. In
particular, in the case of applying such processing,
whereby combining processing successively occurs from
the interior of the outline of a building to the
periphery of the building, to image data of an aerial
photograph of a city in which there will be many
complex rooftop configurations, this embodiment will
enable the shapes of the buildings to be accurately
extracted.

In the description of the preceding embodiments it
has been assumed that the small region detection
section 26 shown in Fig. 35 determines the regions which
are to be classified as part of the set of small
regions (i.e., that are to be subjected to region
combination processing) based upon whether or not the
total area of a region is above a predetermined
threshold value (s pixels). However, it should be noted
that the invention is not limited to this method, and
other types of criteria for selecting these small
regions could be envisaged, depending upon the
requirements of a particular application. For example,
it might be predetermined that regions which are
narrower than a predetermined limit are to be combined
with other regions, irrespective of total area. It
should thus be understood that various modifications to
the embodiments described above could be envisaged,
which fall within the scope claimed for the present
invention.

A tenth embodiment of an image recognition 
apparatus according to the present invention will be 
described. As shown in Fig. 43, this is formed of a 
color image data storage section 1 which stores color 
image data, an image recognition processing section 2 
for performing image recognition processing of the 
color image data, and a combination-processed shape 
data storage section 5 for storing shape data 
expressing a region image, extracted by the image 
recognition processing section 2. 

The image recognition processing section 2 of this
embodiment is made up of a color space coordinates
conversion section 25, color vector data generating
section 21, edge template application section 22,
edge strength and direction determining section 23, an
edge pixel determining section 24 for extracting shape
data expressing an edge image as described hereinabove
referring to Fig. 16, a small region detection section
26, a combination object region determining section 27,
and a region combination processing section 28 for
performing region combining processing as described
hereinabove referring to Fig. 35, and an edge data -
region data conversion section 29.

The color space coordinates conversion section 25
converts the RGB data that are stored in the color
image data storage section 1 to coordinates of an
appropriate color space (i.e., whereby intensity and
chrominance information are expressed respectively
separately). The color vector data generating section
21 generates respective color vectors, each expressed
by a plurality of scalar values, corresponding to the
pixels of the original color image, from the
transformed image data. The edge template application
section 22 applies edge templates to the pixel vector
data, to generate edge vector data. The edge strength
and direction determining section 23 determines the
edge strength and the edge direction information, based
on the magnitudes of the edge vector moduli, as
described hereinabove for the first embodiment, with
the edge pixel determining section 24 determining those
pixels which are located on edges within the color
image, based on the edge strength and direction
information, to thereby obtain shape data expressing
an edge image. The edge data - region data conversion
section 29 converts the edge image data into shape data
expressing a region image. The small region detection
section 26 selects a set of small regions which are
each to be subjected to region combination processing,
and the combination object region determining section
27 determines the next one of that set of small regions
that is to be subjected to the region combination
processing. The combination object region determining
section 27 operates on that small region, to determine
the respective lengths of the common border lines
between that small region and each of its immediately
adjacent regions, and combines the small region with
the adjacent region having the greatest length of
common border line with the small region.

Fig. 44 is a flow diagram of the operating
sequence of the apparatus of the tenth embodiment,
shown in Fig. 43.

The processing of the sequence of steps 20, 10,
11, 12, and 13 is identical to that shown in Fig. 16 of
the second embodiment, described hereinabove, so that
detailed description will be omitted. Similarly, the
processing executed in the sequence of steps 70, 72,
73, 74 is identical to that shown in Fig. 37 for the
seventh embodiment. In step 100, the data expressing
the edge image are converted to data expressing a
region image. This is done by dividing the edge image
into regions, each formed of a continuously extending
set of pixels that are surrounded by edge pixels, and
applying a common label to each of the pixels of such a
region as described hereinabove referring to Fig. 36,
i.e., applying respectively different labels to
identify the various regions.
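
The conversion of step 100 can be sketched in Python as a flood fill over the non-edge pixels. This is an illustration under assumptions: the edge image is a 2D list in which truthy entries mark edge pixels, regions are 4-connected, edge pixels themselves keep the label 0, and the function name `label_regions` is hypothetical.

```python
from collections import deque


def label_regions(edge, edge_label=0):
    """Assign a common label (1, 2, ...) to each connected set of
    non-edge pixels; edge pixels keep edge_label."""
    rows, cols = len(edge), len(edge[0])
    labels = [[edge_label] * cols for _ in range(rows)]
    next_label = 1
    for y in range(rows):
        for x in range(cols):
            if edge[y][x] or labels[y][x] != edge_label:
                continue  # edge pixel, or already labelled
            labels[y][x] = next_label
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy + 1, cx), (cy - 1, cx),
                               (cy, cx + 1), (cy, cx - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not edge[ny][nx]
                            and labels[ny][nx] == edge_label):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels
```

How the edge pixels themselves are assigned to regions is not specified in this passage; leaving them with a separate label is simply the choice made for this sketch.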

A specific example will be described, assuming 
that the simplified aerial photograph which is
represented in the upper part of Fig. 45 is the color
image whose data are to be subjected to
recognition processing by this embodiment. This image 
contains a road 122, two vehicles 121 and a building 
120. When edge detection is applied to this image,
using respective pluralities of scalar values of the
pixels of the color image data, the results are as 
shown in the middle part of Fig. 45. As shown, edge 
data are detected for the road, the vehicles and the 
building, respectively, so that the shapes 123 of the
vehicles appear in the street. The data of that edge
image are then converted to data of a region image as 
described above, and region combining is applied based 
upon the shapes of the regions, without consideration 
of the values of pixels within the regions. The result
obtained is as shown in the lower part of Fig. 45. As
shown, the vehicles have been eliminated, leaving the 
shape 124 of the road accurately represented. 

The upper part of Fig. 46 shows an edge image that 
has been obtained by applying edge detection by an
embodiment of the present invention to a color image
which is an actual aerial photograph containing various
roads and buildings and many vehicles. Numeral 130
indicates various small regions appearing in the edge 
image which correspond to the outlines of respective 
vehicles, while the larger rectangular regions 
designated by numeral 131 correspond to buildings. In 
the original photograph there is almost no difference 
in intensity between the building roofs and the 
surrounding ground surface. Hence, if prior art
methods of image recognition were to be applied in this
instance, it would be difficult to detect the shapes of
the edges of the buildings. However, by applying the
present invention, the building edges are accurately 
detected.

The edge image is then converted to a region
image, and region combination is applied to that region 
image as described above, i.e., with the combination 
processing being based upon the shapes of the regions, 
without consideration of the values of pixels within
the regions, and with the aforementioned threshold
value s being set to an appropriate value for 
substantially eliminating the small regions 130 which 
correspond to vehicles. 

The result obtained is as shown in the lower part
of Fig. 46. As shown, the shapes of many vehicles have
been eliminated, thereby enabling the buildings to be
more easily recognized, without reducing the accuracy 
of extracting the shapes of the buildings. 

As can be understood from the above description of
embodiments, according to one basic aspect, the present
invention provides an image recognition method and 
image recognition apparatus whereby the edges of 
regions expressing objects appearing in a color image 
can be accurately and reliably detected. This is based
upon expressing the color attributes of each pixel of
the image as a plurality of scalar values expressing a 
color vector, and the use of edge vectors corresponding 
to respective ones of a plurality of predetermined edge 
directions (i.e., specific orientation angles within an
image). The pixels of the color image are selectively
processed to derive a corresponding set of edge
vectors, with each edge vector being a vector quantity 
which is indicative of an amount of variation in color 
between pixels which are located on opposite sides of a
line extending through the selected pixel and extending
in the corresponding edge direction. Each edge vector 
is derived in a simple manner by performing an array 
multiplication operation between an edge template and 
an array of color vectors centered on the selected
pixel, and obtaining the vector sum of the result.
With the described embodiments, this operation is
equivalent to selecting first and second sets of pixels 
that are located on respectively opposing sides of the 
selected pixel, with respect to a specific edge
direction, obtaining respective weighted vector sums of
the color vectors of these two sets, and obtaining the 
vector difference between these sums. The edge 
direction corresponding to the edge vector having the 
largest modulus of the resultant set of edge vectors
obtained for the selected pixel (that largest value
being referred to as the edge strength) is thereby 
obtained as the most probable edge direction on which 
that pixel is located, and it thereby becomes possible 
to reliably detect those pixels which actually are
located on edges, based on comparisons of respective
values of edge strength of adjacent pixels, and also to 
obtain the direction of such an edge. 
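
By way of illustration only, the derivation of edge vectors and of the edge strength for a selected pixel can be sketched as follows. The four 3 x 3 templates shown are Prewitt-style templates chosen as an assumption for this sketch and are not asserted to be the templates of the embodiments. The function names, the direction names, and the representation of the image as a nested list of (R, G, B) tuples are likewise hypothetical.

```python
import math

# Assumed 3x3 edge templates, one per edge direction. Each weights the
# pixels on one side of the edge line +1, the other side -1, the line itself 0.
TEMPLATES = {
    "horizontal": [[1, 1, 1], [0, 0, 0], [-1, -1, -1]],
    "vertical": [[1, 0, -1], [1, 0, -1], [1, 0, -1]],
    "diag_down": [[0, 1, 1], [-1, 0, 1], [-1, -1, 0]],  # top-left to bottom-right
    "diag_up": [[1, 1, 0], [1, 0, -1], [0, -1, -1]],    # bottom-left to top-right
}

def edge_vector(image, y, x, template):
    """Multiply the template by the 3x3 array of color vectors centered on
    (y, x), element by element, and return the vector sum of the result."""
    channels = len(image[y][x])
    total = [0.0] * channels
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            weight = template[dy + 1][dx + 1]
            for k in range(channels):
                total[k] += weight * image[y + dy][x + dx][k]
    return total

def edge_strength_and_direction(image, y, x):
    """Return the largest edge-vector modulus and its edge direction."""
    best_direction, best_modulus = None, -1.0
    for direction, template in TEMPLATES.items():
        vector = edge_vector(image, y, x, template)
        modulus = math.sqrt(sum(v * v for v in vector))
        if modulus > best_modulus:
            best_direction, best_modulus = direction, modulus
    return best_modulus, best_direction

# Two colors of equal summed intensity (R + G + B = 200) meet along a vertical line.
RED, GREEN = (200, 0, 0), (0, 200, 0)
image = [[RED, RED, GREEN] for _ in range(3)]
strength, direction = edge_strength_and_direction(image, 1, 1)
```

In this example the summed R, G and B values are the same on both sides of the boundary, so that a detector operating on that scalar intensity alone would report no edge, whereas the modulus of the vertical edge vector is large.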

According to a second basic aspect of the 
invention, a region image which expresses an image as a
plurality of respectively identified regions can be
processed to eliminate specific small regions which are 
not intended to be identified, and which therefore 
constitute noise with respect to an image recognition 
function. This is achieved by first detecting the set
of small regions which are each to be eliminated by
being combined with an adjacent region, then
determining the next one of that set which is to be 
subjected to the combination processing, with that 
determination being based upon specific criteria which 
are designed to prevent the combination of the small 
regions having the effect of distorting the shapes of 
larger regions which are to be recognized. The small 
region thus determined is then combined with an 
adjacent region, with that adjacent region also being 
selected such as to reduce the possibility of 
distortion of regions which are intended to be 
recognized. In that way, the disadvantages of prior
art methods of reducing such small regions, such as by
various forms of filter processing, can be
effectively overcome.



